Abstract


This thesis outlines the results of research into chess
playing program evaluation. A methodology for
experimentally determining chess heuristic performance is
described. It is based on statistical analysis of heuristic
"reactions" to pre-scored moves. Application of this
methodology resulted in the creation of a library of master
games and the evaluation of the components of a specific
chess program, CHUTE. The results identified problem areas
in the program; for example, initial move selection missed
the "ideal" move 69% of the time.

An analysis of error bounds on lookahead strategy and
move selection is also presented. This analysis provided
the basis for the interpretation of the experimental
results.

1.0  INTRODUCTION


The goal of cognitive mechanics research is to produce
machines that display intelligence. The game of chess is an
ideal testbed for this research. It is a game requiring
considerable intelligent decision making, yet it is also a
game of perfect information, where chance is not a factor in
the outcome of decisions. Modelling the "physical universe"
of a chess game is relatively simple, hence we can be
certain that machine performance is not hindered by the
lack of primitive sensory mechanisms. But most importantly,
chess is a challenge to human intelligence and, as such, any
insight gained in developing a chess playing machine may
provide a better understanding of ourselves.

This thesis presents our research into computer chess.
We have not taken it upon ourselves to produce yet another
chess  program.  Rather,  we  explored  one  particular  aspect  of 
the  science  of  chess  programs.  We  examined  the  problem  of 
chess  program  evaluation. 

In  this  chapter,  we  will  outline  the  background  of  the 
research  and  the  general  approaches  taken.  First,  we 
present a discussion of the specific problem areas that we
addressed. Next, we give a short historical background of
computer chess and discuss the advances made so far. Third,
we outline our approach to the problem and sketch our
accomplishments. Finally, we give an outline and guide to
the details presented in the rest of the thesis.


1.1  Statement  of  Problem 


The  realization  of  a  chess  program  is  a  very  broad 
problem  domain.  As  we  shall  see  later,  the  problem  of 
creating a program capable of playing "legal" chess is not
difficult; the harder problem of creating one that plays
"good" chess has also been examined by many people. We
decided to approach things from a slightly different
perspective.

We focused on the problem that makes computer chess of
interest: How does one improve on the play of a chess
program? The problem is not only a question of what one has
to  do  to  fix  specific  problems,  but  also  how  one  finds  and 
defines  these  problems.  If  someone  is  confronted  with  a  very 
complex  piece  of  machinery  that  is  supposedly  not 
functioning  correctly,  then  in  order  to  fix  it  one  must  know 
how  it  works  and  what  can  go  wrong.  The  problem  is  even 
worse  with  chess  programs,  since  the  meaning  of  "correct 
functioning" is ill-defined. A chess program might play the
game  of  chess  within  the  rules  of  chess,  but  our  definition 
of  its  correct  functioning  is  predicated  on  it  "playing 
well",  i.e.,  winning  (or  at  least  not  losing)  against  strong 
opponents. 

So the problem of improvement of chess programs can be
divided into parts. First, we must understand the
functioning of the program, not only the basis of its
design,  but  also  the  things  that  can  go  wrong.  We  must 
understand  how  the  individual  components  interact  and  how 
errors  are  propagated.  Second,  we  must  have  a  reliable 
measure  of  performance  of  the  program.  The  starting  point  in 
improving  anything  is  to  know  how  well  it  already  works,  and 
we must be able to detect the effects of changes. Finally,
there  is  the  problem  of  what  to  change  in  order  to  realize 
improvements.  Effective  solutions  to  the  first  two  problems 
will  lead  us  to  the  changes  most  likely  to  produce 
improvement. 


The  complexity  of  chess  programs  dictates  the  need  to 
deal  with  specific  parts  at  any  one  time  in  isolation  from 
the  rest  of  the  program.  Hence  the  problems  exist  for  each 
of these parts as well as the interaction of these parts. In
other  words,  we  must  be  able  to  measure,  understand,  and 
change  the  parts  separately.  Lack  of  this  facility  will  mean 
that  attempts  to  fix  a  program  will  too  often  result  in 
unwanted  and  unforeseen  side  effects. 

1.2  Background

We now present an outline of the structure employed by
most writers of chess programs. A discussion of the general
history of chess programs follows this outline.


1.2.1  Chess Playing Programs

An  algorithm  for  computer  chess  was  outlined  by  Shannon 
in 1950 [Shannon50]. It consists mainly of the construction
of a tree of moves and positions, moves being represented by
the edges and positions being represented by the nodes. Such
a tree has as its top node the position that is currently
confronting the program and as the first level edges the
moves that are possible from that position. The tree is
generated to a given depth and width, and the terminal
nodes  are  evaluated  according  to  some  criteria.  The  tree  is 
now  searched  to  see  which  of  the  first  level  moves  allow  the 
most  favourable  terminal  position  to  be  reached. 

The  reason  for  having  a  lookahead  tree  is  found  in  the 
complexity  of  most  chess  positions.  Static  evaluation  of 
certain aspects of a position (e.g., captures, forks, pins,
etc.) is usually very difficult. It is easier to examine the
possible consequences from such a position by means of the
lookahead. This means we 'play out' the position and assess
the results. It is usually the case that the consequences
of complex situations are simpler situations. For instance,
we find that complicated exchange situations will result in
easily definable material imbalances. The lookahead tree
allows us to examine complicated positions with more
confidence.

The implementation of the general algorithm entails
programs  that  can  generate  all  legal  moves  from  a  given 
position.  We  must  have  suitable  data  structures  for 
representing  the  board  and  the  tree,  and  an  evaluation 
function  that  can  rank  the  terminal  positions.  Finally,  we 
need a tree generation and search method. There are then
further complications that we encounter due to the
exponential  growth  of  such  a  tree. 

In a typical position in midgame, we can encounter about
35 legal moves for the continuation. If this is true of the
next 5 positions, then if the game tree is expanded to 5 ply
(a ply is one move by one of the players) then the tree
size will be in the order of 35^5 (approximately 5 x 10^7). The
search  time  of  such  a  tree  is  prohibitively  long.  The  depth 
to  which  we  expand  the  tree  is  also  a  factor  in  selecting 
the  correct  continuation.  Ideally,  the  tree  would  be 
expanded  to  the  end  of  the  game.  The  scores  at  the  terminal 
positions  could  be  used  to  limit  the  next  move  to  those  that 
will always lead to the most favourable outcome (a win, if
possible, otherwise, a draw). However, the problem of
exponential growth again rules out this size of tree. In
order to limit the size of the tree and still allow a
reasonable depth of lookahead, we must limit the number of
moves we examine from each position. This can be
accomplished using a static evaluator to select the "most
promising" moves for further lookahead.
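
The growth figures quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the function name is ours, the tree is assumed to have the same number of moves at every node, and the width of 10 moves is a hypothetical pruning limit rather than a value taken from any particular program.

```python
def positions_at_depth(branching: int, ply: int) -> int:
    """Terminal positions in a uniform tree with `branching` moves per node."""
    return branching ** ply


# Full-width tree: about 35 legal moves at each of 5 ply.
full_width = positions_at_depth(35, 5)

# Hypothetical limited tree: only 10 "most promising" moves kept per node.
limited = positions_at_depth(10, 5)

print(full_width)             # 52521875, i.e. roughly 5 x 10^7
print(limited)                # 100000
print(full_width // limited)  # 525
```

Under these assumptions, limiting the width to 10 moves reduces the number of terminal positions by a factor of about 500 at the same depth.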

There  are  two  situations  during  tree  search  when  we  need 
static  evaluation.  The  first  is  in  limiting  the  number  of 
moves  to  be  examined  and  the  second  is  in  assigning  scores 
to  terminal  nodes  for  back-up.  These  two  functions  can  be 
performed  by  two  distinct  static  evaluators  or  by  a  single 
one.  Although  these  functions  are  similar,  a  static 
evaluator that is good at evaluating one need not be as good
at evaluating the other. For our purposes, we will assume
that  we  have  only  one  static  evaluator  that  performs  both  of 
these  functions. 


The general static evaluator consists of several
heuristics  whose  combined  "opinion"  of  the  moves  determines 
the  move  ranking.  These  heuristics  consider  such  factors  as 
material  balance,  development,  attacks,  etc.,  and  grade  the 
move  with  respect  to  these  factors.  The  relative  importance 
of  these  factors  determines  the  weighting  given  to  the 
corresponding  heuristic's  score.  The  weighted  sum  of  the 
heuristic  values,  the  scoring  polynomial,  determines  the 
score  of  the  move.  Some  programs  have  static  evaluators  that 
evaluate  positions  as  opposed  to  moves.  This  has  the  same 
overall  effect,  however.  Whether  we  select  a  move  because  it 
has  the  highest  score  or  because  it  leads  to  a  position  with 
the  highest  score  is  immaterial. 
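
The scoring polynomial described above can be sketched as a weighted sum. The heuristic names, the weights, and the toy score tables below are all hypothetical and chosen only for illustration; they are not the heuristics of any program discussed in this thesis.

```python
from typing import Callable, Dict, List

# A heuristic grades a move (here just a string) with a number.
Heuristic = Callable[[str], float]


def score_move(move: str,
               heuristics: Dict[str, Heuristic],
               weights: Dict[str, float]) -> float:
    """Scoring polynomial: the weighted sum of the heuristic values for a move."""
    return sum(weights[name] * h(move) for name, h in heuristics.items())


def rank_moves(moves: List[str],
               heuristics: Dict[str, Heuristic],
               weights: Dict[str, float]) -> List[str]:
    """Order moves from highest to lowest polynomial score."""
    return sorted(moves,
                  key=lambda m: score_move(m, heuristics, weights),
                  reverse=True)


# Hypothetical toy heuristics backed by lookup tables.
material = {"Qxb7": 1.0}
development = {"Nf3": 1.0}
heuristics = {
    "material": lambda m: material.get(m, 0.0),
    "development": lambda m: development.get(m, 0.0),
}
weights = {"material": 10.0, "development": 2.0}

print(rank_moves(["a3", "Nf3", "Qxb7"], heuristics, weights))
# ['Qxb7', 'Nf3', 'a3']
```

Changing a weight changes the ranking without touching the heuristics themselves, which is why the weights and the heuristics can be studied separately.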


The static evaluator is of prime importance to the
success of a chess program. The decision of which moves to
consider at the first ply is just as crucial as the result
of the tree search. If the best possible move is not ranked
among the moves to be expanded, then the best terminal
position may not be reached. Hence the program is
handicapped by selecting a move which cannot possibly lead
to the best final result. For more details on this and other
problems associated with the general algorithm, the reader
is referred to [Valenti74] and [Newborn75].

The  chess  program  designer  is  then  faced  with  a  design 
trade-off situation: he can devote processing time to a
very sophisticated static evaluator and severely limit the
tree size, or he can employ a very primitive evaluator and
spend  more  time  in  a  larger  tree. 

1.2.2  Chess  Program  History 

The  fundamental  paper  describing  the  general  chess 
playing  algorithm  was  published  by  Shannon  in  Philosophical 
Magazine,  1950.  Similar  schemes  were  also  presented  by 
Turing  and  Wiener.  Shannon  discussed  the  minimax  algorithm, 
which  is  the  basic  method  of  backing  up  the  score  of  the 
terminal position to the first ply in order to select a
move. The evaluation functions that he proposed were based
on three factors: material advantage, pawn structure, and
mobility. The first published attempt at implementation of
Shannon's ideas came in 1957 with the Kister program
[Kister57]. This was a program that played miniature chess,
i.e., on a 6x6 board with no bishops. In 1958 Bernstein
reported on his full chess playing program. His evaluation
function was based on material advantage, mobility, area
control, and king defence. A departure from the standard
type of evaluation was made by Newell et al. [Newell63] in
that the scoring was based on a set of goals and how well a
move achieved these goals. Newell also proposed that
computer chess play would be greatly enhanced if the studies
of human chess playing could be applied to the design of
chess programs [Newell64]. Zorbrist and Carlson [Zorbrist73]
extended this idea. Like Newell, they realized that one
needs to impart human experience to a chess program. They
achieve this by having a chess master teach their program,
using generalized position patterns to give advice to the
plausible move selection routines.


One of the most successful chess programs is the
Greenblatt program [Greenblatt67]. The noteworthy points of
the program are its rich and effective set of heuristics
used in the static evaluator, and its lookahead strategy.
Lookahead is based not simply on a tree, but on a directed
graph. Each position is stored after evaluation to avoid
redundant evaluations. This is different from the normal
tree  search  strategy,  where  nodes  are  discarded  as  they  are 
examined.  Some  50  heuristics  contribute  to  the  scoring  of  a 
move  in  a  dynamic  manner.  Not  all  heuristics  are  valid  for 
a  given  move.  The  choice  of  which  are  applied  is  controlled 
by  higher  level  heuristics. 

Most of the current chess programs are variations on the
same basic theme with individual areas of stress. COKO
[Cooper73], for instance, has very sophisticated search
techniques and tree pruning algorithms, while Chess 3.5
[Newborn75] concentrates on the evaluation function and
employs fairly small trees. Another successful program,
TECH [Gillogly72], employs vast trees, evaluating as many as
500,000 end positions, but the position evaluation is
rudimentary. One of the common complaints about most of
these programs is the lack of chess strategy exhibited by
their play. While they are tactically sound, they are weak
in overall direction of play. It would appear that
improvements within the basic Shannon structure lead only to
marginal gains over the current state of the art. Extensive
histories of chess programs are given in [Newborn75] and
[Mcullen68].

1.2.3  Samuel's Checkers Program

One game playing program that heavily influenced this
thesis is Samuel's checkers program [Samuel63], [Samuel67].
His work is relevant for our purposes not for the
heuristics,  per  se,  but  rather  for  the  methodology  employed 
in  the  development  of  the  scoring  polynomial  and  for  the 
evaluation  of  the  effectiveness  of  the  heuristics  used. 
Unlike  most  chess  playing  programs,  whose  evaluation 
consists  primarily  of  actual  game  play,  Samuel's  program  was 
experimentally  evaluated.  The  experience  that  the  program 
gained  by  actual  game  play  was  applied  in  a  systematic 
manner  towards  the  improvement  of  the  program.  The  emphasis 
of  Samuel's  work  was  on  machine  learning. 

The  original  structure  of  Samuel's  program  was  similar 
to  Shannon's  proposal,  namely,  a  tree  search  directed  by 
evaluation  functions.  In  this  regard,  the  program  was  very 
similar to most chess programs. The heuristics were, quite
naturally,  checkers  dependent,  but  the  score  for  a  move  was 
determined  in  a  familiar  way.  Moves  were  scored  using  a 
scoring  polynomial  composed  of  weighted  heuristic  scores.  It 
is  the  way  this  scoring  polynomial  was  developed  that  is  of 
crucial  importance. 

One  of  the  experiments  Samuel  conducted  was  the 
comparison  of  his  program  to  checkers  games  of  master  play. 
He  constructed  a  library  of  such  games,  and  then  had  his 
program  rank  the  possible  moves  for  each  position  of  each 
game.  Thus,  for  any  given  position,  he  had  a  list  of  moves 
M1, M2, M3, ..., Mn as ranked by his program. He then compared the
proximity of the ratings of the moves and calculated a
correlation  coefficient  as 

C  =  (L-H)  /  (L  +  H) 

where  L  is  the  total  number  of  moves  judged  by  the  program 
as  poorer  than  the  master  move  and  H  is  the  total  number  of 
moves  judged  better  than  the  master  move.  Thus  a  score  of  1 
means that the master move was the same as M1, while a score
of -1 means that the master move was the same as Mn.
Assuming  that  the  master's  move  is  the  optimal  move,  this 
method  can  give  a  good  measure  of  the  performance  of  the 
scoring  polynomial. 
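
The coefficient C = (L-H) / (L+H) can be computed as in the following sketch. The function name and the input layout (one ranked move list and one master move per position) are our own assumptions for illustration; Samuel's actual implementation is not reproduced here.

```python
from typing import List, Sequence, Tuple


def samuel_coefficient(positions: Sequence[Tuple[List[str], str]]) -> float:
    """Compute C = (L - H) / (L + H) over a set of positions.

    Each entry is (ranked_moves, master_move), where ranked_moves is the
    program's ranking M1..Mn (best first) and master_move is the move
    actually played by the master.
    """
    lower = 0   # L: moves the program ranked below the master move
    higher = 0  # H: moves the program ranked above the master move
    for ranked_moves, master_move in positions:
        index = ranked_moves.index(master_move)
        higher += index
        lower += len(ranked_moves) - index - 1
    if lower + higher == 0:
        raise ValueError("no alternative moves to compare against")
    return (lower - higher) / (lower + higher)


ranking = ["a", "b", "c", "d"]
print(samuel_coefficient([(ranking, "a")]))  # 1.0, master move is M1
print(samuel_coefficient([(ranking, "d")]))  # -1.0, master move is Mn
```

Because L and H are totals, positions with many legal moves contribute more to the coefficient than positions with few.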


Another experiment of importance was a learning
experiment, whereby the scoring polynomial was dynamically
improved as the game progressed. The static score was
compared to the backed-up score derived through the
lookahead in the tree. The weightings were adjusted to try
and improve the agreement between the two scores.

A  similar  experiment  compared  two  versions  of  the 
program,  alpha  and  beta.  Alpha  would  play  beta  for  a  number 
of  games  until  alpha  could  consistently  beat  beta.  During 
this  time  alpha  would  juggle  its  heuristic  weightings  and 
the actual heuristics used in the polynomial in order to
achieve a better play than beta. Meanwhile beta's scoring
[...]

[...] outlined methods and much "hand analysis" and consultation
with masters produced a checkers program whose successes
against human opponents place it in the master category.
This feat surpasses the achievements of any chess program to
date, although we must not overlook the fact that checkers
is  a  much  simpler  game  than  chess.  It  must  be  further  noted 
that  checkers  masters  play  "nearly  perfect"  checkers,  i.e., 
after  a  loss,  the  masters  will  be  able  to  agree  on  the 
single mistake that caused the loss, and more than three
quarters  of  the  games  end  in  a  draw.  Samuel  feels  that  there 
will  be  a  champion  chess  program  before  there  is  a  champion 
checkers  program. 

1.3  Attack on Problem

We now present an overview of our approach to the
research and describe the environment of our study.


1.3.1  Approach

In the first step of the research, we collected a data
base of master level games. These games served as the
standard of play against which we could measure the
performance of a chess program. In the next step, we
gathered the data on program performance, derived from the
reactions of a chess program to the games in the game
library. We then did extensive statistical analysis on the
derived  data,  and  produced  an  evaluation  of  the  program 
components. 

An  analysis  of  the  errors  in  the  general  lookahead 
strategy  revealed  that  the  problems  of  heuristic  efficiency 
and weighting were germane to good play. Because of this,
most  of  the  data  on  performance  reflects  the  results  of 
heuristic  scoring. 

As  a  separate  issue,  we  also  examined  the  feasibility  of 
changing the search strategy. However, this line of research
was not pursued to any definite conclusions.

1.3.2  Environment 

In order to develop performance measures for chess
programs in general, we needed an actual chess program and a
data base of master games. We were fortunate to have access
to a chess program named CHUTE. This program was written at
the University of Toronto [Valenti74].

CHUTE was developed by M. Valenti as a Master's thesis.
The goal of his work was to produce a chess program written
in such a way that it could be used as a test bed for
further research into chess programs. It was written in a
higher level language called EPI (an XPL dialect; see
[McKeeman70]), and follows the dictates of structured
programming. The logical structure of the program is similar
to  that  of  most  other  chess  programs  (namely  Shannon's 
structure).  It  has  a  scoring  polynomial,  consisting  of  22 
heuristics,  and  the  search  of  the  tree  is  minimax  with 
alpha-beta  tree  pruning.  There  is  a  small  library  of  book 
openings,  whose  distinctive  feature  is  that  the  line  of 
opening tries to take the opponent "out of book" as quickly
as  possible. 

The  program  plays  a  reasonable  game,  due  mainly  to  the 
chess  advice  given  by  Professor  Z.  G.  Vranesic,  an 
International Chess Master, to CHUTE's author. CHUTE has
competed  in  two  ACM  tournaments  (finishing  ninth  in  1974  and 
fifth  in  1975)  and  at  the  First  Canadian  Computer  Chess 
Championships.  Specific  games  can  be  found  in  Valenti's 
thesis. 

We used CHUTE in this research for the following
reasons: it is, as claimed, easily modifiable, hence
modifications for the purpose of scoring master games could
be achieved with few problems; the heuristics used in the
program appeared extensive, hence suitable for a good
analysis of the master games; it was available and usable
on the computing resources at hand. The actual version of
CHUTE that we used was CHUTE1, completed in the fall of
1974.
[...] program. Since there were no readily [...]
library includes 98 games by [...]ine, Capablanca, Euwe, Horowit[...]
tournament play and hence [...]ard of chess skill. The library [...]
in the University of Toronto [...] library access, see Appendix 4).

We used two program packages to do the statistical
analysis of the performance data. These packages were
SPSS (Statistical Package for the Social Sciences) [Nie70]
and BMD (Biomedical Computer Programs) [Dixon68]. Our use of
these packages was fairly standard. We will not burden the
reader with the details.

1.4  Accomplishments of Research

The  main  contribution  of  this  thesis  is  the  development 
of a methodology for the evaluation of chess programs. We
believe that the measures of performance that we describe
can serve as a vital tool in the development and improvement
of chess programs.

A second contribution is the creation of a library of
master games. Such a library is necessary for the
application of the measures.

Finally,  we  have  evaluated  an  existing  program.  This 
demonstrates the feasibility and applicability of the
methodology.

1.5  Structure of the Thesis

In the next chapter, we provide a description of the
theoretical basis of chess programs. We also examine
sources of error in the standard chess strategy. We close
the chapter with the presentation of the theoretical basis
of our methods.

The  third  chapter  describes  in  detail  the  methods  that 
we  used  in  the  research.  Included  is  information  on  how  we 
generated  the  game  library,  how  we  gathered  the  performance 
data,  and  how  we  analyzed  it. 

The fourth chapter contains the results of the research.
These  results  include  our  experience  with  the  master  game 
library  generation,  the  data  gathering,  and  the  evaluation 
of  CHUTE.  We  end  the  chapter  with  a  discussion  and 
interpretation  of  the  results  and  their  impact  on  CHUTE  and 
the  evaluation  methodology. 

The  final  chapter  contains  our  opinions  on  the 
directions future work may take. Included are both general
and CHUTE-specific topics.

2.0  THEORY


The construction of chess programs is not totally an ad
hoc  activity.  There  is  a  body  of  theory  about  game  playing 
from  which  the  chess  programmer  can  draw.  The  richest  body 
of  knowledge  in  this  theory  is  in  the  area  of  lookahead 
strategy. Unfortunately, most of the chess-specific
[...]

Tree Search


The motivation for having the game tree, as we discussed
earlier, is found in the need to 'untangle' complex
positions. We now present methods that can be used to
generate  and  search  game  trees. 

There  is  first  the  minimax  algorithm  proposed  by  Shannon 
in  his  original  descriptions  of  the  problems  of  chess  play 
by  machine.  This  algorithm  provides  a  way  in  which  the 
knowledge  obtained  by  lookahead  can  be  applied  to  the 
problem  of  deciding  the  next  move  from  the  current  position. 

There  is  a  basic  assumption  about  the  environment  of  the 
analysis.  We  take  for  granted  the  infallibility  of  the 
opponent: we assume that he (it) will always make the
optimal  reply.  Assuming  otherwise  will  most  certainly  get 
the  program  into  trouble  if  it  plays  anybody  with  master 
level  skill.  This  basic  assumption  leads  to  some 
definitions  about  the  line  of  play. 

We define the ideal continuation for a position to be a
sequence of moves that leads to the best possible conclusion
of the game from that position. The factors that determine
the  best  possible  conclusion  include  considerations  such  as 
the  length,  as  well  as  the  outcome,  of  the  game.  In  other 
words,  it  is  better  to  win  in  5  ply  than  in  11  ply  and  it  is 
better  to  lose  in  11  ply  than  5  ply  (if  a  loss  is 
inevitable). Since the replies in the ideal continuations
are always the best possible ones, the moves taken must also
be the best [...] order [...] best [...]

We [...] algorithm. In the first step, we generate the tree of
possibilities and invoke a static evaluator to score the
terminal nodes. We now "back up" a score for the root node
from the terminal nodes in the following way: Given an
internal node in the tree, if it represents a position where
it is the computer's turn to move, then the score of that
node is the maximum of the scores of the node's immediate
successors; if the node represents the opponent's turn, then
its score is the minimum of the scores of the node's
immediate  successors. 

After  all  scores  are  backed  up  to  the  root,  we  can 
select  a  move.  This  move  is  the  one  that  leads  to  the  node 
at the second level of the tree that has the highest score.
The path through the tree that leads to the terminal node
whose score was finally assigned to the root node is called
the principal continuation.

We  now  present  an  example  to  illustrate  the  operation  of 
this algorithm. Consider the following game tree, taken to
a depth of 2 ply:

                    A
        +-----------+-----------+
        |           |           |
     +--B--+     +--C--+     +--D--+
     |     |     |     |     |     |
   E=16   F=1  G=15  H=-3   I=7  J=15

FIG. 2.1  Example Game Tree

The nodes A, B, ..., J denote positions while the edges
connecting the nodes denote the moves from one position to
another. The values of the terminal nodes denote the score
that was assigned to that position by some static evaluator.
The more positive the score the better the position is for
the computer. Node A represents the position that is
currently facing the computer and the edges AB, AC, and AD
represent all the possible moves that could be made from
position A. The next level of edges represents all the
possible  replies  that  the  opponent  could  make  to  any  move 
chosen  by  the  computer.  (In  an  actual  game,  the  number  of 
positions  in  the  tree  would  be  much  larger,  but  the  analysis 
generalizes  to  trees  of  greater  depth  and  width.) 

In our example, we would first back up the scores to B, C,
and D. Since these nodes represent the opponent's possible
moves, we take the minimum of the successors, hence B is
assigned a score of 1, C gets -3, and D gets 7. We now back
up to A and assign it the score that is the maximum of its
successors, namely 7. The tree now looks like this:

                   A=7
        +-----------+-----------+
        |           |           |
     +-B=1-+     +C=-3-+     +-D=7-+
     |     |     |     |     |     |
   E=16   F=1  G=15  H=-3   I=7  J=15

FIG. 2.2  Scored Game Tree

So  we  now  know  that  the  best  move  to  make  is  the  one 
leading  to  position  D. 
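
The back-up procedure can be sketched in a few lines of Python and run on the example tree. The dictionary layout and the function name are our own illustrative choices, not part of any program discussed here; the node names and scores are those of the example.

```python
from typing import Dict, List, Tuple

# Example tree: node -> successors. Terminal nodes have no entry.
TREE: Dict[str, List[str]] = {
    "A": ["B", "C", "D"],
    "B": ["E", "F"],
    "C": ["G", "H"],
    "D": ["I", "J"],
}
# Static scores assigned to the terminal nodes.
STATIC: Dict[str, int] = {"E": 16, "F": 1, "G": 15, "H": -3, "I": 7, "J": 15}


def minimax(node: str, computer_to_move: bool) -> Tuple[int, List[str]]:
    """Return (backed-up score, path from `node` to the terminal node
    whose static score was backed up)."""
    successors = TREE.get(node)
    if not successors:
        return STATIC[node], [node]
    results = [minimax(s, not computer_to_move) for s in successors]
    # Maximum on the computer's turn, minimum on the opponent's turn.
    choose = max if computer_to_move else min
    score, path = choose(results, key=lambda r: r[0])
    return score, [node] + path


score, principal_continuation = minimax("A", True)
print(score)                   # 7
print(principal_continuation)  # ['A', 'D', 'I']
```

The returned path is the principal continuation: the computer plays the move leading to D, and the opponent's best reply leads to I.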

If we analyze the operation of the algorithm, we see
that the score of the root is the same as the score of some
particular terminal node. This terminal node is reachable if
we always make the move that takes us to the highest score
at the next level and the opponent always makes the reply
that takes him to the lowest score (for us) at the
subsequent level.

It  is  interesting  to  note  an  apparent  paradox  in  this 
method: we are guaranteed of choosing the strongest move by
minimax if the static evaluation at the terminal nodes is
perfect. But if the static evaluation were perfect, we could
just as well apply the static evaluation to the successors
of the root node and not have to bother with more extensive
lookahead. So, if we don't have perfect static evaluation,
why do we still rely so heavily on the correctness of the
backed-up score?

The answer to this question was outlined in section
1.2.1: we assume that the static evaluation will be more
accurate in assessing the possible consequences of a
position than in assessing a position directly. Thus the
backed-up score represents a more reliable evaluation of a
position than is possible with direct static scoring.
Hence, it is because we don't have perfect static
evaluations that we use lookahead. Also, lookahead is a
means whereby we can identify and avoid obvious blunders.

2.1.2  Pruning 

To limit the exponential growth of the game tree, we
must resort to searching selected portions of the tree. One
way to do this is to prune the tree as it is built. There
are two methods of pruning: forward pruning and backward
pruning [Nilsson71], [Newborn75].

Forward pruning occurs when we limit the fanout of nodes
as the tree is created without the benefit of knowledge
about nodes at a lower level. For instance, if the current
position has 35 possible continuations, we might choose to
consider only 10 of them for inclusion in the tree. The
choice of which 10 moves to include is based on a static
assessment of the moves. The reliability of forward pruning
is totally dependent on the accuracy of the static
evaluator.
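As a minimal sketch of forward pruning, assuming the static evaluator is available as a function from a move to a number, the candidate set is simply the top-ranked moves. The function name forward_prune and the sample scores are hypothetical and chosen only for illustration.

```python
def forward_prune(moves, static_score, fanout):
    """Keep only the `fanout` moves that the static evaluator ranks highest."""
    ranked = sorted(moves, key=static_score, reverse=True)
    return ranked[:fanout]


# Hypothetical static scores for four moves.
scores = {"a": 1, "b": 5, "c": -2, "d": 3}
candidates = forward_prune(list(scores), scores.get, 2)
print(candidates)
```

Any move outside the returned list is never expanded, which is why the method stands or falls with the accuracy of the static scores.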


A backward pruning algorithm finds branches which are
unnecessary in a bottom-up fashion. In this type of pruning
we make use of information found at the terminal level. The
alpha-beta algorithm is an example of a backward pruning
algorithm.

The alpha-beta algorithm is an addition to the basic
minimax algorithm that increases the efficiency of the
search. In order to realize this gain in efficiency, we must
examine the game tree in a depth-first manner.


The philosophy behind this algorithm is to stop scoring
a  node  whenever  further  refinement  of  the  score  is  not 
necessary.  It  is  not  necessary  to  refine  the  score  when  we 
can  be  sure  that  the  score  will  not  be  backed  up  to  a  higher 
level  node.  For  instance,  if  a  node  is  to  be  scored  as  the 
maximum  of  its  successors,  then  we  need  an  accurate  score 
for only the successor with the maximum score.

We apply this algorithm in two forms: one for the even
ply nodes and the other for the odd ply nodes. We
outline the algorithm for the even ply case.

The first step is to assign a temporary value, N, to the
subject node. This value is initially the score of one of
the successors of the subject node. If this successor is a
terminal, then N will be a static score. If the successor
is not a terminal, then N is derived through the recursive
application of the odd ply form of the alpha-beta algorithm.

The odd ply form of this algorithm proceeds as
described  above,  with  the  obvious  reversals  of  sign  due  to 
the  minimax  strategy.  In  all  cases,  choice  of  which 
successor  to  score  next  is  derived  through  the  application 
of  some  specific  strategy.  This  strategy  usually  depends  on 
the  static  evaluator.  For  instance,  we  could  always  pick 
the  initial  successor  of  a  move  based  on  the  static 
evaluations  of  the  successors. 

We  now  present  an  example  to  illustrate  this  pruning 
method.  This  example  is  based  on  our  previous  one  for  the 
minimax  method.  See  FIG.  2.1. 

Starting  from  position  A,  we  generate  only  that  portion 
of  the  tree  that  leads  to  one  terminal  position.  In  this 
case  we  could,  for  instance,  generate  nodes  D  and  J.  We 
apply  the  static  evaluator  to  J  to  get  a  score  of  15.  We  now 
might  generate  position  I  and  score  it  to  get  7.  The  minimax 
method  now  stipulates  that  D  will  have  a  backed-up  score  of 
7.  Next  we  could  generate  positions  C  and  H  and  score  H  to 
get  -3.  At  this  point  we  can  ignore  all  other  successors  of 
C  since  we  know  that  by  the  minimax  algorithm  the  greatest 
value  that  C  could  get  as  a  backed  up  value  is  -3.  Thus  the 
value  of  C  is  less  than  D,  and  hence  the  current  maximum 
score  at  this  level  is  7.  We  can  thus  cut  off  all  successors 
to  C  for  we  know  that  further  expansion  will  not  result  in  a 
higher value for C. A similar process applied to node B
results in a cutoff of all successors of B save node F. The
actual  part  of  the  tree  searched  and  the  resulting  scores 
are  as  follows: 


                       A = 7
                         |
         +---------------+---------------+
         |               |               |
       B = 1           C = -3          D = 7
         |               |               |
         |               |           +---+---+
         |               |           |       |
       F = 1           H = -3      I = 7   J = 15

FIG. 2.3 Alpha-Beta Pruning

Notice  that  the  generation  of  the  tree  and  the  search 
proceed  simultaneously,  hence  a  cutoff  means  that  we  never 
generate  or  score  certain  parts  of  the  tree.  The  time  of 
computation  for  a  tree  of  depth  d  of  fanout  f  (fanout  is  the 
number  of  successors  each  node  in  the  tree  has)  for  the 
simple  minimax  is  proportional  to  f**d.  When  alpha-beta 
pruning is applied, we get a time proportional to f**(d/2)
in the best case (when the principal continuations are
always explored first) and f**d in the worst case (when the
principal continuations are always explored last)
[Nilsson71]. The ordering of the search determines the
actual saving in time. It is important to realize that this
saving in time does not result in any sacrifice of accuracy
in the scoring. The alpha-beta algorithm is guaranteed to
yield the same results (and the same principal continuation)
as the simple minimax.
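The worked example can be reproduced with a short Python sketch of alpha-beta. It is written in the common single-function form with alpha and beta bounds rather than the even ply and odd ply forms described in the text. The names TREE, STATIC and alphabeta are illustrative, and the successors are listed in the order the example explores them (D first, then C, then B).

```python
import math

# Successors listed in the order explored in the worked example (illustrative).
TREE = {"A": ["D", "C", "B"], "D": ["J", "I"], "C": ["H", "G"], "B": ["F", "E"]}
STATIC = {"E": 16, "F": 1, "G": 15, "H": -3, "I": 7, "J": 15}


def alphabeta(node, alpha, beta, maximizing, visited):
    """Depth-first alpha-beta; `visited` records the terminals actually scored."""
    if node not in TREE:
        visited.append(node)
        return STATIC[node]
    best = -math.inf if maximizing else math.inf
    for child in TREE[node]:
        value = alphabeta(child, alpha, beta, not maximizing, visited)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:          # further successors cannot change the result
            break
    return best


scored = []
root = alphabeta("A", -math.inf, math.inf, True, scored)
print(root, scored)
```

Under this ordering only J, I, H and F are scored, which matches FIG. 2.3, and the root value is the same 7 that plain minimax produces.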

Any  saving  in  time  that  we  can  gain  through  this  method 
can  be  very  important  to  good  play:  with  simple  minimax,  we 
can  search  a  tree  with  fanout  f  and  depth  d  in  time  t.  If 
we  apply  alpha-beta  in  the  optimal  way,  we  can  search  a  tree 
with fanout f**2 or a tree with depth 2*d in the same time t.
Thus  our  chances  of  choosing  the  right  move  increase  without 
any  increase  in  computation  time. 
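To put numbers on this claim, the sketch below counts terminal nodes for a uniform tree. The best-case count used for alpha-beta, f**ceil(d/2) + f**floor(d/2) - 1, is the standard result for a perfectly ordered uniform tree; it is not taken from this thesis, and the function names are illustrative. The fanout of 35 is the figure quoted earlier for a typical position.

```python
def minimax_leaves(f, d):
    """Terminal nodes scored by plain minimax on a uniform tree."""
    return f ** d


def alphabeta_best_case_leaves(f, d):
    """Terminal nodes scored by alpha-beta with perfect move ordering."""
    return f ** ((d + 1) // 2) + f ** (d // 2) - 1


print(minimax_leaves(35, 4))              # full search, depth 4
print(alphabeta_best_case_leaves(35, 4))  # perfectly ordered, depth 4
print(alphabeta_best_case_leaves(35, 8))  # perfectly ordered, depth 8
```

For f = 35 the perfectly ordered depth-8 search scores about twice as many terminals as plain minimax at depth 4, so the "depth 2*d in the same time" statement holds up to a small constant factor.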


2.1.3 Static Evaluation


As  mentioned  before,  we  rely  on  the  static  evaluator  to 
limit  the  fanout  in  tree  generation  and  to  select  the  order 
of  search  in  the  alpha-beta  algorithm.  The  static  evaluator 
is  also  used  to  score  the  terminal  positions. 


The  usual  form  of  the  evaluation  function  is  a  scoring 
polynomial  consisting  of  a  linear  combination  of  heuristic 
scores.  Each  heuristic  scores  some  aspect  of  the  move  (or 
position) that is under scrutiny.

In fact, we don't really need the same ordering as long as
the top moves chosen by both methods are the same (although
a bad ordering may increase the time spent finding the
principal continuation).

To be effective, the polynomial must contain a
sufficient number of heuristics to measure all relevant
aspects of a position. Furthermore, it is equally important
that proper weightings be assigned to the heuristics in
order to reflect the relative importance of the aspects.
Finally, the heuristics themselves must display certain
qualities.


One  important  quality  of  heuristics  is  what  we  call 
"completeness".  In  order  to  be  complete,  a  heuristic  must 
reflect,  through  its  score,  the  changes  in  the  particular 
aspect  it  is  supposed  to  measure.  As  an  example,  one  such 
aspect  is  the  material  balance  of  a  position,  defined  as  the 
difference  in  the  number  of  pieces.  If  white  has  two  pawns 
and  black  has  four  pawns  then  white's  score  for  the  pawn 
advantage  aspect  is  -2  while  black's  score  for  the  same 
aspect  is  +2.  A  complete  material  balance  heuristic  would 
always ensure that this pawn score reflects the difference
in  pawn  material. 
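A scoring polynomial of this kind, together with the pawn example, might be sketched as follows. The function names, the second heuristic ("mobility") and the weights are assumptions made purely for illustration; they are not CHUTE's actual heuristics or weightings.

```python
def pawn_balance(own_pawns, opponent_pawns):
    """Material balance aspect for pawns, seen from one side."""
    return own_pawns - opponent_pawns


def scoring_polynomial(heuristic_scores, weights):
    """Linear combination of heuristic scores."""
    return sum(weights[name] * score for name, score in heuristic_scores.items())


# White has two pawns, black has four.
white = pawn_balance(2, 4)   # -2
black = pawn_balance(4, 2)   # +2

# Hypothetical weights and a hypothetical second heuristic.
weights = {"pawn_balance": 10, "mobility": 1}
total = scoring_polynomial({"pawn_balance": white, "mobility": 3}, weights)
print(white, black, total)
```

The weights carry the relative importance of the aspects, so an error in a weight distorts the total even when every individual heuristic is complete.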




2.2 Errors in Conventional Methods

The  ultimate  manifestation  of  errors  in  a  chess  program 
is  in  its  inability  to  win  or  draw.  In  order  to  improve  the 
quality of play, we must isolate the reasons for failure to
specific  components  or  component  interactions.  It  is  only  in 
this  way  that  we  can  begin  to  make  any  reasonably  effective 
changes. 

The  following  is  an  analysis  of  where  errors  can  occur 
within  the  overall  algorithm  and  how  these  errors  affect  the 
quality of play. Since there are varying degrees of play
proficiency,  both  in  humans  and  in  chess  programs,  we  would 
also  like  to  have  some  measure  of  the  severity  of  errors. 

We  first  look  at  the  different  types  of  error;  we  then 
develop  bounds  on  these  errors. 

2.2.1  Tree  Search  and  Pruning  Errors 

We  now  examine  how  errors  can  occur  in  tree  search  and 
generation. There are two distinct types of errors. We get
type 1 errors as a result of deficiencies in forward
pruning. Type 2 errors are due to errors in scoring at the
terminal nodes.

At each position in the tree, the evaluation function
ranks the moves in order of preference and selects the top
few for further expansion. One of the possible moves for any
position is the ideal move. This ideal move is the start of
the ideal continuation for that position. If the evaluation
function does not include this move among the top moves,
then a type 1 error has occurred. We can assign a
probability of occurrence of this error, A(f), as a
function of the evaluation function and the fanout (f).
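Given a set of positions for which the ideal move is known, A(f) could be estimated as the fraction of positions whose ideal move falls outside the top f ranked moves. The sketch below assumes each record is a pair (moves ranked by the evaluation function, ideal move); the record layout, the function name and the sample data are hypothetical.

```python
def type1_error_rate(records, fanout):
    """Fraction of records whose ideal move is not among the top `fanout` moves."""
    records = list(records)
    misses = sum(1 for ranked, ideal in records if ideal not in ranked[:fanout])
    return misses / len(records)


# Hypothetical sample: (moves in ranked order, ideal move).
sample = [
    (["e4", "d4", "Nf3"], "e4"),
    (["Nc3", "e4", "d4"], "d4"),
    (["a3", "h3", "e4"], "e4"),
    (["d4", "c4", "Nf3"], "c4"),
]
for f in (1, 2, 3):
    print(f, type1_error_rate(sample, f))
```

The estimate can only fall or stay level as the fanout grows, since a wider candidate set can never exclude a move that a narrower one included.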

Notice  that  all  type  1  errors  would  disappear  in  a  full 
breadth tree. However, type 1 errors would still occur in a

2.2.1.1 Type 1a Errors

In this error type, a type 1 error occurs at the root of
the tree. This means that the ideal move is not considered
as a viable next move by the static evaluation.
Non-inclusion at this level implies that the move will
certainly not be made since its ranking cannot be improved
by lookahead. The ideal continuation is immediately
abandoned.


The severity of this error depends on the nature of the
ideal continuation and the nature of the other
continuations: the ideal continuation could have meant an
immediate win or it could have been the only escape from a
certain loss. In general, the magnitude of the error is
equal to the difference in the value of the abandoned ideal
continuation and the ideal continuations of the positions
accessible using the moves ranked in the top f moves. No
recovery from a type 1a error is possible, since the choice
of move cannot be deferred. Since we assume that the
opponent makes the ideal reply, there will not be another
chance to encounter this position again. This
non-recoverability of type 1a errors is contrasted to type 1
errors at lower levels of the tree, since the positions at
lower levels may be re-evaluated on subsequent moves.

2.2.1.2 Type 1b Errors


This  error  occurs  at  level  2  of  the  tree  expansion.  At 
this  level,  the  polynomial  tries  to  predict  the  reply  to  the 
contemplated  moves.  A  type  1  error  at  this  point  means  that 
the  program  missed  the  best  reply  to  the  move.  The  severity 
again  depends  on  the  ideal  continuation  of  the  missed  reply. 
The  effect  depends  on  the  final  scoring  of  the  original 
moves.  In  other  words,  if  there  are  three  moves  under 
consideration and a type 1b error is made on one of them.
move ends up with an erroneously better score than the ideal
move. Thus this error can occur independently of a type 1a
error and may occur more than once in an expansion. The
probability  of  this  error  occurring  is  the  same  as  a  type  1 
error;  however,  the  probability  of  it  having  an  effect  is 
determined  by  the  probability  of  choosing  the  move  where  it 
occurred. 

2.2.1.3 Type 1c Errors

an effect. One possible effect of these errors is that the
worst and/or the best terminal nodes for the expansion of
each of the candidate moves may not be reached, hence the
backed up values may not reflect the true possible outcome
of the move. Missing ideal replies lower down will result in
missing the worst terminal nodes, while missing the best
moves will result in missing the best terminal nodes. This
can result in the erroneous ranking of the candidate moves
by lookahead.

2.2.1.4 Type 2 Errors

We now look at the errors that can occur at the terminal
nodes. We call these type 2 errors.

The method of scoring the terminal nodes in the tree is
usually related to our static evaluator. We employ the
evaluation function to rank the possible moves from the
terminal node and then take the score of the best of these
moves as the score for that node.

A type 2a error occurs when the ideal move for the
position represented by a terminal node is not rated as the
best by the static evaluator. In CHUTE, this error is
similar to a type 1 error with f (the fanout) equal to 1.
The result is that the position will not be properly
scored, hence the backed up value from this position will be
wrong. Note that such an error has a very wide range of
potential effects.

Type 2b errors occur due to errors in the absolute
scores of the terminal nodes. This means that it is possible
that, even though each terminal node reflects the score of
the ideal move, the values assigned to these moves rank
their relative merit incorrectly. Again, the backed up value
and pruning are affected. This error type exemplifies the
need to have a uniform absolute scale for the scoring of
moves, since relative scoring among moves for one position
is not enough. The probability of this error is a function
of the scale used as well as the evaluation function.

2.2.2 Static Evaluation

As  we  outlined  before,  each  heuristic  scores  some  aspect 
of  the  move  (or  position)  that  is  under  scrutiny.  The 
derivation  of  the  final  heuristic  score  is  subject  to  two 
distinct  types  of  errors;  first  in  the  heuristic’s  ability 
to  assess  an  aspect,  and  secondly  in  the  relative  importance 
assigned  to  heuristics  in  the  scoring  polynomial.  The 
combination of these errors determines the probability of the
errors that appear in tree search.

The first type of error is "incompleteness". The
definition of this problem follows from our earlier
discussion of "completeness". A heuristic is incomplete if
it does not fully reflect the changes in an aspect. As an
example,  we  again  examine  a  heuristic  that  measures  the 
material  balance  of  pawns.  If  this  heuristic  adjusts  the 
pawn  material  measure  every  time  the  computer  captures  a 
pawn  but  doesn’t  adjust  the  measure  when  the  opponent  takes 
a  pawn,  then  we  would  say  that  the  heuristic  is  incomplete. 


The second type of error occurs when the relative
importance of one aspect to another is not correctly
defined. For instance, is it more important to maintain a
good pawn structure or to sacrifice a good pawn structure in
order to develop a piece? Since this type of error is
crucial to ranking of moves, its impact is very prominent.
However, this type of error is also the hardest to isolate,
since intuitively it is obvious that relative importance is
highly dependent on the current position.

In  order  to  derive  any  meaningful  measure  of  the 
probability  and  severity  of  these  errors,  we  need  some 
absolute  scale  of  scoring.  We  need  to  compare  performance 
with the real assessment of positions and moves. In other
words, if we don't know the ideal score or move for
positions, then we can't determine the error in evaluation.

The use of the same static evaluator for both pruning
and terminal evaluation can amplify the impact of the two
types of errors. The relative importance of aspects may not
be the same for both of these functions. For instance, a
checking move may be relatively unimportant when compared to
other moves at the terminal level; however, we certainly
would not want to prune that same move at an intermediate
tree level.


2.2.3  Error  Bounds 

We  now  look  at  how  data  on  type  1  and  type  2  errors  can 
be  used  in  analyzing  the  lookahead  strategy. 

In deriving error bounds inherent in the lookahead tree,
we must first state some assumptions about the nature of
type 1 errors. We assume that the selection of a candidate
move set for a particular position in the tree is
independent of selection for other positions at the same
level. Hence the probability of type 1 error, A(f), is only
a function of the fanout f and the evaluation heuristics
used. These heuristics are invariant from node to node. Thus
for nodes at the same level, the outcome of type 1 errors
can be considered as Bernoulli trials. This allows us to
deal with the type 1 error at a fixed level as a binomial
experiment, and hence we can apply the general formula:

$$\binom{n}{k} \, p^{k} \, q^{n-k}$$

where n - number of trials
      k - number of "hits"
      p - probability of a "hit"
      q - (1-p)

Notation: Read this as "n choose k times p to the kth
power..."

Thus the probability of exactly k type 1 errors occurring
at any fixed level, i, of the lookahead tree is:

$$\binom{N_i}{k} \, p^{k} \, q^{N_i - k}$$

where N_i - number of nodes at level i
      f_i - fanout at level i
      p = A(f_i), where A is the probability of a type 1 error
      q = 1-p
      k - number of times a type 1 error occurs


From this general statement we can develop several
statements about the type 1 errors. First, we know that at
the root we have a probability of A(f1) of a type 1a error.
We can now further assert that the probability of a type 1b
error at level 2 (or more precisely, the probability of 1 or
more occurrences of type 1b) is:

$$\sum_{j=1}^{f_1} \binom{f_1}{j} \, A(f_2)^{j} \, \bigl(1 - A(f_2)\bigr)^{f_1 - j}$$

where f1 - fanout at the root
      f2 - fanout at the second level

The degenerate case of the above, that all second level
nodes have a type 1b error, is:

$$A(f_2)^{f_1}$$
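The binomial sum for one or more type 1b errors can be checked numerically. The sketch below evaluates the sum directly and compares it with the equivalent closed form 1 - (1 - A(f2))**f1, the complement of the "no errors" term. The function names and the sample values A(f2) = 0.3 and f1 = 5 are illustrative assumptions, not measurements from the thesis.

```python
from math import comb


def prob_at_least_one(a, n):
    """Sum over j = 1..n of C(n, j) * a**j * (1 - a)**(n - j)."""
    return sum(comb(n, j) * a**j * (1 - a) ** (n - j) for j in range(1, n + 1))


def prob_all(a, n):
    """Degenerate case: every one of the n nodes has the error."""
    return a**n


a_f2, f1 = 0.3, 5                      # hypothetical error rate and root fanout
print(prob_at_least_one(a_f2, f1))     # binomial sum
print(1 - (1 - a_f2) ** f1)            # closed form, same value
print(prob_all(a_f2, f1))
```

Even a moderate per-node error rate makes at least one type 1b error at level 2 very likely once the root fanout is more than a handful of moves.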

One  of  our  main  concerns  in  reaching  master  level  play 
is  that  the  ideal  continuation  be  in  the  lookahead  tree.  We 
can  calculate  the  probability  of  abandoning  this 
continuation  at  level  i  as  the  product  of  the  probability  of 
keeping  the  continuation  until  level  i  and  the  probability 
of  then  making  a  type  1  error.  The  probability  of  including 
the  ideal  move  in  the  candidate  set  for  a  position  is  simply 
1 - A(f), where again f is the fanout. Thus the probability
of missing the ideal continuation at level 2 is
(1 - A(f1)) * A(f2). This generalizes to the probability of
deviation at level i as:

$$A(f_i) \prod_{j=1}^{i-1} \bigl(1 - A(f_j)\bigr)$$

From the above, we can also derive the probability of
including the ideal continuation in the tree as:

$$\prod_{j=1}^{i} \bigl(1 - A(f_j)\bigr)$$

This  gives  us  an  idea  of  what  the  chances  are  of  knowing 
where the best move will lead. Even if we assume perfect
terminal node scoring at level i, then our forward pruning
method  could  still  introduce  gross  scoring  inaccuracies, 
depending  on  the  magnitude  of  A(fi). 
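The two products above are easy to evaluate for a given set of error rates. In the following sketch the list error_rates holds A(f_1), A(f_2), ... for successive levels; the function names and the rates 0.2, 0.3 and 0.4 are hypothetical values used only to show the arithmetic.

```python
from math import prod


def prob_deviation_at(level, error_rates):
    """A(f_i) times the product of (1 - A(f_j)) for j = 1..i-1."""
    kept_so_far = prod(1 - a for a in error_rates[: level - 1])
    return error_rates[level - 1] * kept_so_far


def prob_kept_through(level, error_rates):
    """Product of (1 - A(f_j)) for j = 1..i."""
    return prod(1 - a for a in error_rates[:level])


rates = [0.2, 0.3, 0.4]                   # hypothetical A(f_1), A(f_2), A(f_3)
for i in (1, 2, 3):
    print(i, prob_deviation_at(i, rates))
print(prob_kept_through(3, rates))
```

As a consistency check, the deviation probabilities for levels 1 to i and the probability of keeping the continuation through level i add up to 1.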

We  now  look  at  the  problem  of  choosing  the  ideal  move, 
given  that  the  ideal  move  is  in  the  candidate  set  at  the 
root  level.  Note  that  inclusion  in  the  candidate  set  only 
means  that  we  consider  the  move,  not  that  we  necessarily 
choose  it  for  the  next  move.  Whether  we  choose  it  or  not 
depends  on  the  outcome  of  the  tree  search  and  how  the  top 
level  candidate  moves  are  finally  ranked.  In  turn,  the  tree 
search  outcome  is  dependent  on  type  1  and  type  2  errors. 

The nature of type 1 errors at lower levels is such that
their effect may not be negative. In certain instances,
errors cancel out, thus producing the same ranking that
perfect scoring would have produced. In actual fact, we
don't even need identical ranking as long as the best move
is ranked highest. This depends heavily on relative scoring
accuracy, not on absolute scoring, hence backing up a wrong
absolute score may not be disastrous.

Let’s  look  at  the  situation  of  a  node  at  a  particular 
level  in  the  tree.  We  are  faced  with  the  problem  of 
deriving  a  score  for  this  node  so  that  the  backup  procedure 
can  proceed.  The  score  can  be  derived  either  by  further  tree 
expansion from this node, or from statically scoring the
node. In both cases the correct score depends on choosing
the  best  move  to  be  the  ideal  move.  In  the  latter  case,  this 
means  an  error  rate  defined  by  the  probability  of  type  2 
errors.  In  the  former  case,  we  depend  on  the  error 
possibilities  of  tree  expansion.  We  can  define  the 
probability  of  deriving  a  perfect  score  for  the  node  by 
lookahead  in  the  following  manner: 

Assume that we are at a node at level i. At level t,
t>i, we assume that terminal nodes are correctly ranked.
Thus we can be certain that the node is scored perfectly (in
the ranked sense) if the ideal continuation for each node of
the subtree defined by our starting node is in the subtree.
The probability of this is:

$$\prod_{i=k+1}^{n-1} \bigl(1 - A(f_i)\bigr)^{Z_i}$$

where k - level of the node we are starting at
      n - depth of the entire game tree
      Z_i - number of nodes at level i, calculated as:

$$Z_i = \prod_{l=1}^{i-1} f_l$$
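The product can be evaluated directly from the definitions of k, n and Z_i given above. The sketch below follows the formula as stated; the function name and the sample fanouts and error rates are hypothetical.

```python
from math import prod


def prob_perfect_score(k, n, fanouts, error_rates):
    """Product over i = k+1..n-1 of (1 - A(f_i)) ** Z_i.

    fanouts[i-1] is f_i and error_rates[i-1] is A(f_i), levels numbered from 1.
    """
    result = 1.0
    for i in range(k + 1, n):
        z_i = prod(fanouts[: i - 1])      # Z_i = f_1 * ... * f_(i-1)
        result *= (1 - error_rates[i - 1]) ** z_i
    return result


# Hypothetical tree: fanouts 3, 2, 2 and an error rate of 0.5 at every level.
print(prob_perfect_score(1, 4, [3, 2, 2], [0.5, 0.5, 0.5]))
```

Because Z_i grows as a product of fanouts, the exponent rises quickly with depth, and the probability of a perfectly scored subtree drops sharply unless the error rates are very small.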


While perfect scoring ensures that the node will be
assigned the value of the ideal move, it is still possible
that the ideal move is chosen due to cancellation of errors.
The probability of this happening is a function of the
subtree  depth,  width,  type  1  errors  and  type  2  errors. 
Therefore,  in  order  to  assess  the  advantage  of  continuing 
tree  search  past  a  certain  level,  we  can  compare  the  chances 
of  a  type  2  error  to  the  chances  of  correct  scoring  based  on 
further  tree  expansion. 

2.3 Basis of Research Methods

We now present the underlying analysis of our methods.
First, we present justification for how the data base was
developed. Next, we look at how we would use performance
data in defining error bounds and a program evaluation.
Finally, we discuss the Closeness To Win strategy (CTW)
[Horning72]. This strategy is the source of the ideal
scoring scale used in some of the measures.

2.3.1  The  Game  Library 

In  order  to  evaluate  a  chess  program,  one  of  the  things 
we  need  to  measure  is  "correctness"  of  move  selection.  For 
a  given  position,  we  would  like  to  compare  the  program's 
move  to  the  ideal  move.  We  can  pick  out  the  ideal  move  in 
two  ways:  we  can  examine  a  complete  game  tree,  or  we  can 
have  a  master  pick  it  out  for  us.  The  first  way  is  not 
feasible. The second way seems more reasonable.

The effect of master selection is achieved if we use
positions  that  have  occurred  in  master  play.  A  master  game 
is,  in  effect,  a  nearly  ideal  path  through  a  game  tree.  The 
problem  of  ensuring  that  this  path  leads  to  the  optimal  win 
is  solved  by  our  selection  of  games;  we  ensure  that  the 
play  is  close  to  optimal  by  the  fact  that  masters  are  making 
the  moves.  Thus  the  assumption  in  using  these  games  is  that 
the  winner  tries  to  win,  the  loser  tries  to  stall  the  loss 
and  the  quality  of  play  ensures  that  "good"  moves  were 
selected  to  attain  these  respective  goals.  At  the  least,  we 
are  assuming  that  the  quality  of  these  games  is  better  than 
that  attained  by  any  chess  program  to  date. 


In later analysis we assume that the moves in the
library are the ideal moves. In the strictest sense, this
assumption is wrong, since masters are human and subject to
imperfections. However, until the time that computers play
master level chess, we can use master play as our ideal.

2.3.2 Evaluation


Comparing the program moves to the ideal moves in the
library will allow us to derive reliable error measures. In
particular, we can derive the probability of type 1 and type
2 errors. The reliability of these probabilities is
dependent on the correctness of the "ideal" moves used, and
the size of our data sample.

The evaluation of the program components will be outlined
in more detail later. However, we still have the question
of total program evaluation. The best (and least
feasible) method of evaluation would be to have the program
use its full resources to play through the master library.
We could then compare the move selection to the ideal moves.

In the absence of such an experiment, the best we can do
is to try to relate our knowledge of component interaction
and component evaluation to an overall evaluation.

We can argue that total evaluation is not a necessary
prerequisite to improvement. As long as we can establish
that a component improvement results in overall improvement,
and we can measure component improvement, then we can
establish that overall improvement has occurred.

2.3.3  The  CTW  Strategy 

An  alternative  to  the  minimax  game  tree  search  is  the 
CTW strategy (see Appendix 1). The philosophy of this
strategy  differs  from  minimax  in  that  search  is  directed  by 
uncertainty  rather  than  purely  by  absolute  scoring.  This 
means  that  the  probability  of  type  1  and  2  errors  can  be 
included  in  the  considerations  for  pruning.  A  feature  of 
this  strategy  is  the  introduction  of  a  uniform  scoring  scale 
based  on  the  closeness  to  the  end  of  game. 


the 1/N score

Such a scoring scale is a necessary step in trying to
eliminate errors in heuristic scoring. The CTW scoring
scheme is also "ply sensitive". The backed up scores will
reflect the depth of back-up as well as the value of
back-up. Since both heuristic and back-up scoring are on the
same scale, we can expect a much more homogeneous scoring
system.

We  can  apply  the  CTW  scoring  method  in  the  following 
way:  the  scoring  polynomial  assigns  a  "closeness  to  win"  to 
a move in the range of -1 (lose) to 0 (draw) to +1 (win). This
same  type  of  measure  is  used  in  the  tree  lookahead.  From  a 
given  position,  if  a  move  results  in  a  win  in  N  ply,  then 
the  value  of  the  move  is  1/N,  if  a  loss  in  N  ply  then  -1/N. 
Naturally,  the  game  tree  may  contain  a  terminal  win/loss  at 
more  than  one  level,  but  the  ideal  score  is  obtained  by 
noting  at  which  level  a  win/loss  is  inevitable  given  that 
the  winner  will  make  moves  to  achieve  this  win  and  the  loser 
will  make  moves  to  avoid  a  loss  as  long  as  possible. 
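The 1/N scale described here can be written as a small function. This is a sketch of the scale only, not of the CTW search itself; the function name ctw_score and the string labels "win", "loss" and "draw" are illustrative choices.

```python
def ctw_score(outcome, plies=None):
    """Closeness-to-win value: +1/N for a win in N ply, -1/N for a loss, 0 for a draw."""
    if outcome == "draw":
        return 0.0
    if outcome not in ("win", "loss"):
        raise ValueError("outcome must be 'win', 'loss' or 'draw'")
    if plies is None or plies < 1:
        raise ValueError("plies must be a positive integer for a win or loss")
    value = 1.0 / plies
    return value if outcome == "win" else -value


print(ctw_score("win", 4))    # win in 4 ply
print(ctw_score("loss", 2))   # loss in 2 ply
print(ctw_score("draw"))
```

On this scale a quicker win scores higher than a slower one, and a loss that is postponed scores higher than an immediate one, which matches the assumption that the winner hurries and the loser stalls.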

Another  aspect  of  the  CTW  strategy  is  the  need  to 
calculate  an  uncertainty  score  for  a  position.  There  are 
two  areas  where  we  could  look  for  uncertainty:  one  is  the 
uncertainty  inherent  in  the  scoring  of  ordinary  heuristics, 
and  another  is  the  uncertainty  inherent  in  the  position 
itself. In the first case, we would want to derive some
measure of the probability of error of heuristics and
calculate the uncertainty as a function of this measure.
This seems a very natural approach, since the need for the
uncertainty  measure  is  precisely  dictated  by  the  fact  that 
we  can't  totally  rely  on  the  correctness  of  the  heuristic's 
assessments.  In  the  second  case,  we  could  have  specialized 
heuristics  whose  only  function  is  the  calculation  of 
uncertainty  in  positions.  Determining  uncertainty  from  the 
original  polynomial  presents  the  same  problem  as  determining 
errors  in  evaluation.  We  need  the  ideal  scores  and  moves  in 
order  to  derive  uncertainty. 

We  have  used  the  CTW  strategy  throughout  the  research  as 
a  philosophical  guide.  We  have  also  adopted  some  of  its 
specific techniques (like the 1/N scoring), and made many of
the  same  underlying  assumptions. 

However,  we  believe  that  the  measuring  techniques  that 
we  developed  can  be  successfully  applied  without  reliance  on 
one  specific  lookahead  strategy.  In  particular,  the  ideal 
scoring  scale  need  not  be  1/N.  The  only  thing  that  is 
required  is  that  the  properties  of  an  ideal  scale  be 
maintained.  These  properties  are,  basically,  that  the  scale 
be  an  absolute  measure  and  that  it  reflect  certainty  and 
distance to win or loss.

3.0 METHODS

In this chapter we outline our methods in greater
detail. Along with the actual mechanics of the experiments,
we also provide their justification.

We begin with the discussion of the game library: how it
was created and what qualities it has. Next, we examine the
manner  in  which  this  library  was  used  in  order  to  give  us 
the  raw  data  on  CHUTE'S  performance.  Finally,  we  outline 
how  the  raw  data  was  analyzed  and  what  we  expected  from  the 
analysis. 

3.1 The Game Library

The  first  step  of  the  research  involved  the  gathering  of 
games for the game library. Ninety-eight games were
selected  as  suitable  and  coded  into  machine  readable  form. 
The  games  were  then  further  processed  by  CHUTE  to  produce  a 
file  of  4353  scored  position/move  records. 

The  games  that  were  included  in  the  library  had  to  meet 
certain  criteria.  First,  the  calibre  of  the  players  had  to 
be  at  the  master  or  grandmaster  level;  hence  most  of  the 
games  were  taken  from  the  records  of  international 
tournament  competition.  The  games  were  heavily  annotated  and 
analyzed  in  the  publications  where  they  appeared  and  it  was 
not difficult to exclude games that were one-sided or games
that, on close analysis, exhibited a lot of questionable
moves. Such questionable moves (marked as '?') are
discussed  in  more  detail  in  a  later  section.  One  important 
criterion  was  that  the  game  had  to  have  a  winner.  In  other 
words,  drawn  games  were  not  used  in  the  game  file.  This 
feature  was  necessary  for  the  purposes  of  scoring  the  games. 
In  short,  the  qualities  the  games  had  to  meet  were  that  the 
players  should  be  very  competent  and  the  outcome  of  the  game 
should  be  determined  by  superior  play  rather  than  by  obvious 
blunders. 

The  transformations  needed  to  make  the  game  library 
suitable  for  processing  by  CHUTE  were  minimal.  The  method  of 
notation used by the publications was the Standard Chess
notation.  Thus  games  were  copied  directly  to  a  computer 
readable  medium.  Although  CHUTE  is  capable  of  move 
specification  in  the  Standard  notation,  certain 
clarification  and  editing  functions  were  needed.  These 
included  substitutions  of  "N"  for  "Kt",  for  "x"  etc.  The 
notation  accepted  by  CHUTE  has  a  few  limitations.  For 
example,  certain  move  specifications  that  are  acceptable  in 
the  Standard  notation  are  ambiguous  to  CHUTE.  This  situation 
arises  in  capture  moves  and  checking  moves.  In  order  to 
overcome these difficulties, a simple editing program and
manual corrections were needed.

We describe the present status of the game library in
more detail in Appendix 4. This appendix contains a
description of the format of the file as well as information
on  how  to  access  it. 

3.2 Data Gathering

In  order  to  assess  CHUTE'S  performance,  we  needed  raw 
data  on  the  performance  of  CHUTE'S  components.  This  data  was 
gathered  by  examining  CHUTE'S  reaction  to  the  games  in  the 
game  library. 

We  were  mainly  interested  in  the  performance  of  the 
scoring  polynomial  in  CHUTE,  hence  the  lookahead  facilities 
were  disabled.  The  reader  should  keep  this  in  mind  during 
the  following  discussions.  Also  note  that  CHUTE  uses  the 
same  polynomial  for  forward  pruning  and  terminal  position 
evaluation. 

The  positions  of  each  game  in  the  game  library  were 
scored  and  processed  in  order  to  produce  a  file  of  position 
and  move  data.  This  scoring  was  done  in  two  stages.  The 
game  was  first  scored  by  CHUTE  and  then  by  a  back-up  method 
based  on  the  CTW  strategy. 

3.2.1  Initial  Scoring 

In  the  first  stage,  CHUTE  processed  the  game.  In  effect, 
a game was played through by CHUTE, with the exception that
the positions for both sides were forced to be the positions
of  the  game  being  processed.  The  output  of  this  step,  for 
each  move,  consisted  of 

1)  the  board  position  in  CHUTE'S  internal 
representation 

2)  the  side  to  play  (Black  or  White) 

3)  the  master's  move 

4)  the  seven  ordered  moves  that  CHUTE  considered  as 
best 

5)  the  value  of  each  heuristic  for  the  master's  move 

6)  the  value  of  each  heuristic  for  the  move  rated 
highest  by  CHUTE 

Notice  that  in  case  CHUTE  agreed  with  the  master's 
choice of move, items 5 and 6 were the same.

The  heuristics  that  we  used  for  scoring  were  the 
heuristics  in  CHUTE'S  scoring  polynomial.  The  following 
table  gives  a  summary  of  the  function  of  each  of  the  22 
heuristics. We refer the reader to Appendix 2 for a more
complete description.

HEURISTIC #   ALIAS     DESCRIPTION

     0        HEURM00   attack a passed pawn
     1        HEURM01   attack an attacking piece
     2        HEURM02   leave an attack
     3        HEURM03   expose to attack or defence
     4        HEURM04   interpose an attack
     5        HEURM05   add/delete defenders
     6        HEURM06   attack an opponent piece
     7        HEURM07   development
     8        HEURM08   freedom for bishop, knight
     9        HEURM09   attack last moved piece
    10        HEURM10   rook on open file
    11        HEURM11   rook on own passed pawn file
    12        HEURM12   value for captured piece
    13        HEURM13   specials (castling, en passant, etc.)
    14        HEURM14   sacrifice vs. saved
    15        HEURM15   double up pieces
    16        HEURM16   attack squares around king
    17        HEURM17   create/avoid pin on to-square
    18        HEURM18   pawn considerations
    19        HEURM19   block pins
    20        HEURM20   discourage successive moves with
                        the same piece
    21        HEURM21   endgame, get king into action

TABLE 3.1 - Heuristic Descriptions

3.2.2 Back-up Scoring

The  output  file  of  a  scored  game  was  next  processed  by 
the  back-up  routine.  First,  a  back-up  score  was  assigned  to 
each  move.  This  score  was  calculated  as  1/N,  where  N  was  the 
number  of  plies  from  the  position  scored  to  the  end  of  the 
game.  For  the  winner  of  the  game,  this  score  was  positive, 
for  the  loser  it  was  negative.  For  instance,  for  a  move 
twenty  ply  from  the  end,  the  backed  up  score  was  1/20  if  the 
winner  was  to  move,  and  -1/20  if  the  loser  was  to  move. 
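
The 1/N back-up scoring can be sketched as follows. This is only an illustration, not CHUTE's back-up routine: the function name `backed_up_scores`, its parameters, and the convention that N counts the move about to be played are assumptions made for the example.

```python
from fractions import Fraction

def backed_up_scores(total_plies, winner_moves_first):
    """Return the 1/N score for every ply of a decisive game.

    Ply k (1-based) receives 1/N, where N is the number of plies from
    that position to the end of the game. The score is positive when
    the eventual winner is to move and negative otherwise.
    """
    scores = []
    for ply in range(1, total_plies + 1):
        # Assumption: N includes the move about to be played.
        n = total_plies - ply + 1
        # Odd plies belong to the side that moved first.
        winner_to_move = (ply % 2 == 1) == winner_moves_first
        sign = 1 if winner_to_move else -1
        scores.append(sign * Fraction(1, n))
    return scores
```

Exact fractions are used so that scores such as 1/20 and -1/20 can be compared without rounding error.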

Another  function  of  the  back-up  routine  was  to  convert 
the  heuristic  scores  assigned  by  CHUTE  from  a  move  score  to 
a  position  score.  This  was  necessary  for  the  purposes  of  the 
analysis  and  is  justified  later. 

The  values  of  the  converted  heuristic  scores  for  any 
particular  position  reflect  the  cumulative  effect  of  the 
move  scores  for  all  previous  positions.  Let  H(n,i)  be  the 
new  derived  heuristic  score  for  heuristic  n  at  position  i 
and  let  M(n,i)  be  the  old  score  for  the  heuristic  n  at 
position  i.  We  assume  that  H(n,0)  is  0  and  H(n,1)  =  M(n,1). 
The  general  case  is  then  found  by 


H(n,i) = H(n,i-2) - M(n,i-1) + M(n,i)

This scoring method reflects the value of the aspect of
the  position  scored  by  the  heuristic.  This  value  is  affected 
by  both  the  side  to  move  and  the  opponent’s  previous  moves. 
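
As an illustration of the conversion, the sketch below applies the recurrence to the move scores of a single heuristic, assuming it reads H(n,i) = H(n,i-2) - M(n,i-1) + M(n,i) with H(n,0) = 0 and H(n,1) = M(n,1). The function name `position_scores` and the list-based input are hypothetical.

```python
def position_scores(move_scores):
    """Convert move scores M(n,1..k) of one heuristic into position scores H(n,1..k).

    move_scores[0] holds M(n,1), move_scores[1] holds M(n,2), and so on.
    """
    h = {0: 0}                      # H(n,0) = 0 by assumption
    for i, m in enumerate(move_scores, start=1):
        if i == 1:
            h[i] = m                # H(n,1) = M(n,1)
        else:
            # H(n,i) = H(n,i-2) - M(n,i-1) + M(n,i)
            h[i] = h[i - 2] - move_scores[i - 2] + m
    return [h[i] for i in range(1, len(move_scores) + 1)]
```

For the move scores 3, 1, 4, 2 this gives the position scores 3, -2, 6, -4: each side's running total is credited with its own move and debited with the opponent's preceding move.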


All the heuristic scores on the file were converted in
this manner. The records of the game file at this point
contained

1) the board position

2) the side to play

3) the master's move

4) the seven ordered moves of CHUTE

5) the value of each heuristic for the master's position

6) the value of each heuristic for the position from
CHUTE's best move

7) the backed-up 1/N score

3.2.3 Position ...ions

One of the common features of master games is that they
often end in resignation. Since the 1/N scoring method
assumes that the ending is a checkmate, the following
convention was adopted to deal with the anomaly. If the game
was resigned, then the winner was 'awarded' a checkmate 4
ply from the point of resignation if the number of moves so
far in the game was greater than 30, and 6 ply if the number
of  moves  was  less  than  30.  If  the  game  ended  in  checkmate, 
then  no  modifications  were  needed. 
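
The resignation convention amounts to a small adjustment of the game length before the 1/N scores are computed. The sketch below is illustrative only; the function name is hypothetical, and the treatment of a game of exactly 30 moves (which the text does not specify) is an assumption.

```python
def effective_game_length(plies_played, moves_played, resigned):
    """Return the game length in plies used for 1/N scoring.

    A resigned game is extended to an assumed checkmate: 4 plies later
    when more than 30 moves were played, otherwise 6 plies later.
    """
    if not resigned:
        return plies_played          # game ended in checkmate
    # Assumption: exactly 30 moves is treated like the "less than 30" case.
    extra = 4 if moves_played > 30 else 6
    return plies_played + extra
```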

Our  rationale  behind  this  measure  was  that  a  resignation 
in  the  early  part  of  the  game  would  be  due  to  a  hopeless 
position,  whereas  near  the  end  the  loser  could  see  an 
impending  checkmate.  This  is  a  gross  generalization,  but 
some  simple  mechanism  had  to  be  developed  that  would 
differentiate  resignations  from  checkmates  without  expensive 
analysis,  or  more  precisely,  guessing  as  to  the  motivation 
for  the  resignation. 

Some  further  selection  of  the  position  records  had  to  be 
made  in  order  to  derive  the  final  set  of  positions  for  the 
library.  This  selection  was  based  on  the  presence  of  a  '?' 
move  in  the  game.  As  explained  before,  a  '?'  move  means  that 
the  master  move  was  clearly  not  the  optimal  move  for  the 
position.  The  validity  of  the  1/N  score  depends  on  the  moves 
after  a  move  being  optimal.  Thus,  only  the  portion  of  a  game 
that had no '?' moves after any given move was usable. So
all  the  records  prior  to  and  including  a  '?'  move  were 
discarded.  Note  that  this  discarding  occurred  as  the  very 
last  step.  The  cumulative  heuristic  scores  of  the  retained 
records  reflected  the  summation  over  the  whole  game.  The 
reasons  for  this  will  become  more  apparent  in  the 
discussions  of  the  analysis. 
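
The record selection can be sketched as a filter that keeps only the records following the last '?' move of a game. The record layout (a dictionary with a `questionable` flag) and the function name are hypothetical and serve only to illustrate the rule.

```python
def usable_records(records):
    """Discard every record up to and including the last '?' move.

    `records` is the list of position/move records of one game, in
    game order; each record carries a boolean 'questionable' flag.
    """
    last_questionable = -1
    for index, record in enumerate(records):
        if record["questionable"]:
            last_questionable = index
    return records[last_questionable + 1:]
```

Because the filter runs last, the cumulative heuristic scores in the retained records are still computed over the whole game.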

3.3 Analysis of Data

The  scoring  of  the  game  library  provided  us  with  the 
necessary  raw  data  for  further  analysis.  We  now  had,  for 
each  position,  the  ideal  move  and  its  ideal  score.  We  also 
had  information  on  how  CHUTE  ranked  the  ideal  move  in 
relation  to  its  own  choices.  The  following  analyses  that  we 
performed  helped  to  condense  and  summarize  this  mass  of 
data. 


3.3.1 Ranking


A  ranking  measure  similar  to  the  one  employed  by  Samuel 
was  used  to  measure  both  the  original  effectiveness  of  CHUTE 
and  the  effectiveness  of  the  derived  results.  As  mentioned 
before,  a  measure  of  the  predictive  power  of  the  scoring 
polynomial  is  how  it  ranks  moves  as  compared  to  lookahead 
ranking.  If  we  assume  that  the  master's  move  is  the  best 
move  in  any  position,  then  we  can  measure  how  the  polynomial 
ranks  the  best  move.  The  ranking  can  give  us  the  values  of 
the  type  1  error  probability  as  a  function  of  fanout. 


The actual measure was calculated as follows. The seven
best moves, as picked by CHUTE for each position, were
examined. The relative rank of the master move was then
noted and the results tallied to produce the following set
of measures:

P1 P2 P3 P4 P5 P6 P7 P8

where Pi represents the percentage of cases examined where
the master's move was ranked within the first i moves, and
P8 represents the percentage of cases where the master's
move was not in the top seven moves. We could interpret
1 - Pi as the probability of type 1 error, A(i), for fanout i.
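
A minimal sketch of the tally is given below. It assumes that the rank of the master's move among CHUTE's seven ordered moves is available for each position, with `None` standing for "not among the seven"; the function names and this input format are hypothetical.

```python
def ranking_measures(master_ranks, top_n=7):
    """Return P1..P(top_n) and the miss percentage P(top_n + 1).

    Pi is the percentage of positions in which the master's move was
    ranked within CHUTE's first i moves; the last measure is the
    percentage of positions in which it was not ranked at all.
    """
    total = len(master_ranks)
    measures = {}
    for i in range(1, top_n + 1):
        hits = sum(1 for r in master_ranks if r is not None and r <= i)
        measures[f"P{i}"] = 100.0 * hits / total
    misses = sum(1 for r in master_ranks if r is None or r > top_n)
    measures[f"P{top_n + 1}"] = 100.0 * misses / total
    return measures

def type1_error(measures, fanout):
    """Estimate A(i) = 1 - Pi, with Pi converted from a percentage."""
    return 1.0 - measures[f"P{fanout}"] / 100.0
```

The same tally can be run over any subset of the records, for example only those positions whose ply number falls in a chosen range.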

We  applied  this  measure  to  the  game  library  four 
different  times.  Each  time  we  varied  the  portion  of  the 
file  that  we  used.  This  was  done  in  order  to  see  how  well 
CHUTE  did  in  different  parts  of  a  game. 

The  first  run  was  over  the  whole  file  with  no  data 
exclusion.  This  provided  us  with  a  general  evaluation  of 
the  polynomial.  We  next  used  only  those  portions  of  the 
file  that  occurred  in  the  beginning  of  the  game.  This 
included  all  positions  of  ply  less  than  16.  The  next  run 
examined  the  mid-game,  plies  16  to  60.  Finally,  we  looked  at 
the  end  game,  plies  over  60.  These  ranges  are,  of  course, 
gross  generalizations;  actual  game  phase  ranges  are  highly 
dependent  on  the  particular  game  examined. 

We  next  tested  modifications  to  the  scoring  polynomial. 
The ranking measure was a convenient way of detecting the
effects of these modifications. The test data this time was
a set of 3 games. This smaller file was different from the
other data we gathered in the details of scoring. In the
regular file, we had the score of each heuristic for the
ideal move and CHUTE's move only. In this special file, we
had complete scoring of all candidate moves. This meant
that each possible move for a position was scored. Thus,
when we varied the polynomial weighting, we could fully
assess the changes in ranking.

Again, we had four runs on the test data. The first run
was the special control run. For this run we used the
unmodified polynomial, hence we could tell CHUTE's normal
response to this particular set of games. This ...

The third and fourth runs were done with new weightings
for the heuristics in the polynomial. Again, we leave to
the next chapter the details of these weightings.

3.3.2 Scoring Statistics

We performed some basic statistical analysis on CHUTE's
heuristic scoring scale. We hoped that this type of
analysis would isolate those heuristics which were
consistently right or consistently wrong.


In  the  first  measure,  we  used  the  scores  assigned  by 
CHUTE.  In  this  particular  case,  these  scores  were  not 
converted  to  positional  scoring.  For  each  position,  we  had 
a  score  from  each  heuristic  for  the  ideal  move  and  for 
CHUTE'S  first  move. 

We  calculated  the  means  and  standard  deviations  of  the 
scores  for  each  heuristic  in  the  two  groups:  the  ideal 
moves  and  the  CHUTE  first  choice  moves.  We  also  calculated 
the  mean  and  standard  deviations  of  the  differences  in  the 
scores.  This  set  of  measures  was  run  over  two  data  samples: 
the  whole  game  file  and  the  part  of  the  game  file  where  the 
CHUTE  first  choice  and  the  master  move  were  different. 
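
For one heuristic, these statistics can be sketched as below. The function names are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev

def summarize(scores):
    """Return (mean, sample standard deviation) of a list of scores."""
    return mean(scores), stdev(scores)

def heuristic_statistics(master_scores, chute_scores):
    """Summarize one heuristic over paired master and CHUTE moves.

    master_scores[k] and chute_scores[k] are the heuristic's scores for
    the master's move and for CHUTE's first choice in position k.
    """
    diffs = [m - c for m, c in zip(master_scores, chute_scores)]
    return {
        "master": summarize(master_scores),
        "chute": summarize(chute_scores),
        "difference": summarize(diffs),
    }
```

Running the same function over only those positions where the two moves differ gives the second data sample.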


Another  measure  that  we  hoped  would  isolate  heuristic 
characteristics  was  a  correlation  measure.  We  calculated 
the  correlation  between  each  heuristic's  score  for  the  ideal 
move  and  the  CTW  score  for  the  same  move.  For  this  measure 
we  used  the  converted  position  score  for  each  heuristic. 
The  assumption  that  we  made  in  doing  this  was  that  if  the 
heuristic were complete, then we would get a positive
correlation. This presumes that the CTW score is on an
appropriate scale, on which a "good" heuristic would be
nearly linear.
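
The correlation measure can be illustrated with the standard Pearson coefficient. The text does not name the exact statistic used, so this choice, the function name, and the handling of a constant series are assumptions.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between heuristic scores xs and CTW scores ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # Assumption: a heuristic that never varies is reported as uncorrelated.
        return 0.0
    return sxy / sqrt(sxx * syy)
```

A heuristic whose converted position score rises linearly with the 1/N score yields a coefficient near +1, while one that falls as the 1/N score rises yields a value near -1.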


3.3.3  Heuristic  Weightings 

The  purpose  of  this  experiment  was  to  provide  us  with  an 
evaluation  of  the  scope  of  the  scoring  polynomial.  In 
particular,  we  wanted  to  measure  the  polynomial  in  terms  of 
an  absolute  scoring  scale.  We  also  wanted  to  determine  if 
the  set  of  heuristics  in  the  polynomial  was  complete  and  to 
derive  a  better  set  of  weightings  for  the  heuristics. 


This  experiment  was  similar  to  one  done  by  Samuel,  where 
the  static  score  for  a  move  was  compared  to  the  backed-up 
score  derived  through  the  lookahead  in  the  game  tree.  The 
difference  arises  in  that  the  position  file  we  used  to 
provide  a  backed-up  score  had  a  much  higher  reliability  than 
Samuel’s  method  allowed.  We  performed  multiple  regression 
analysis  of  the  heuristic  scores  to  the  ideal  1/N  score  in 
order  to  suggest  the  new  weightings.  We  would  also  see  what 
percentage  of  the  variance  of  the  1/N  score  we  could  account 
for  by  using  a  linear  combination  of  the  22  heuristics.  The 
model  is  as  follows. 


The master's move was scored using the 1/N method. This
score should be derivable as the result of the scoring
polynomial. Hence the ideal 1/N score was regarded as the
dependent variable and the heuristic scores were regarded as
the  independent  variables.  Using  a  large  number  of  such 
variable  n-tuples,  namely  the  data  on  the  position  file, 
multiple  regression  analysis  would  produce  a  linear 
combination  of  the  independent  variables.  This  would 
estimate the dependent variable with minimum mean square
error.  This  linear  combination,  or  polynomial,  could  then 
be  used  to  predict  values  of  the  dependent  variable.  The 
predictive  equation  we  derived  is  as  follows: 

score=  w0*h0  +  w1*h1  +  ...  +  w21*h21  +  c  +  r 


where  wi  are  the  weightings,  hi  are  the  heuristic  scores,  c 
is  a  constant,  and  r  is  the  residual  error.  The  residual  has 
mean  zero  and  the  smallest  standard  deviation  possible  for 
any  linear  combination  of  the  heuristic  scores.  Hence  for 
this set of heuristics, over this collection of data, the
above  polynomial  provides  the  optimal  linear  prediction 
function. 
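
The regression step can be sketched in plain Python as an ordinary least-squares fit with a constant term, solved through the normal equations. This is not the statistical package used in the research; the function names and the small solver are assumptions chosen to keep the example self-contained.

```python
def fit_weights(rows, targets):
    """Least-squares fit of targets ~ w0*h0 + ... + wk*hk + c.

    rows[j] holds the heuristic scores of position j and targets[j]
    its ideal 1/N score. Returns (weights, c).
    """
    k = len(rows[0]) + 1                      # heuristics plus constant
    xs = [list(r) + [1.0] for r in rows]
    # Normal equations: (X^T X) w = X^T y
    a = [[sum(x[i] * x[j] for x in xs) for j in range(k)] for i in range(k)]
    b = [sum(x[i] * t for x, t in zip(xs, targets)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("heuristic columns are linearly dependent")
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, k):
            factor = a[r][col] / a[col][col]
            for c in range(col, k):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    solution = [0.0] * k
    for r in range(k - 1, -1, -1):
        tail = sum(a[r][c] * solution[c] for c in range(r + 1, k))
        solution[r] = (b[r] - tail) / a[r][r]
    return solution[:-1], solution[-1]

def variance_explained(rows, targets, weights, c):
    """Fraction of the variance of the targets accounted for by the fit."""
    preds = [sum(w * h for w, h in zip(weights, row)) + c for row in rows]
    mean_t = sum(targets) / len(targets)
    ss_res = sum((t - p) ** 2 for t, p in zip(targets, preds))
    ss_tot = sum((t - mean_t) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot
```

With 22 heuristics each row has 22 entries; the example is kept general so that a reduced heuristic set can be fitted in the same way.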

The  value  of  regression  is  in  assessing  the  overall 
effectiveness of the original weightings. The weightings
suggested  by  the  regression  can  be  compared  to  the  original 
weightings  and  used  to  determine  how  well  the  original 
polynomial  predicts  the  linear  relationship  of  the 
heuristics  to  the  CTW  score.  The  new  weightings  can  also  be 
interpreted  as  a  measure  of  heuristic  effectiveness.  A  high 
score means that the heuristic is a good predictor of CTW,
while a low or negative score means that it is a poor
predictor.

The residual error (r^2) measure gives us an upper bound
on the expected reliability of the heuristic set used in the
polynomial. This points out any possible weaknesses in the
heuristics themselves as well as the possibility of
non-linear relationships of the heuristics to the CTW score.

The weightings derived and the effectiveness of the new
polynomial are highly dependent on the success of the
original regression fit. This is measured in terms of the
percentage of the variance of the dependent variable that
the polynomial accounts for. If this measure is low to begin
with, then the new polynomial can't be expected to be very
effective.

... file (e.g., plies 1 to 15, plies 16 to 60 and plies
... and up).

The 6th measure was also a material measure, but it was
calculated as a combination of the above scores. The actual
measure was:

MATRL = ADV02 + 5*ADV03 + 3.25*ADV04 + 3.5*ADV05 +
        9.75*ADV06

This score was based on the piece weightings used by
Greenblatt [Greenblatt67].

3.3.4  Uncertainty 


The uncertainty score of a position is a vital part of
the CTW strategy. There should be a logical relationship
between the scoring of a move and the method by which the
uncertainty measure is derived. Thus we employed the
following method to devise an uncertainty measure based on
the existing heuristics. We hoped that the result would give
us a measure of which of the heuristics most accurately
predicts the presence of large residual errors from the
scoring polynomial.

Although this type of measure is not immediately
applicable to evaluation, it is necessary for the possible
implementation of the CTW strategy. The method was as follows.

Due to the regression experiments we had the generalized
scoring polynomial

P = w0*h0 + w1*h1 + ... + w21*h21 + c + r

For a given position we could calculate an error term as
follows:

r^2 = (P - (w0*h0 + w1*h1 + ... + w21*h21 + c))^2

This represented the error of the predictor for this
particular case. The position file was read by a program
that, for each position, calculated r^2 and produced a file
containing r^2 and the heuristic values that contributed to
its calculation. This resultant file was then analyzed using
regression to produce a predictor polynomial of the form

r^2 = B0*h0 + B1*h1 + ... + B21*h21

So  we  derived  two  polynomials,  one  that  scored  a  move 
as  to  its  expected  win  and  one  that  estimated  the  accuracy 
of  that  expectation.  Both  polynomials  were  based  on  the 
same  set  of  heuristics. 
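
The first half of this procedure, producing the squared residual together with the heuristic values for each position, can be sketched as follows. The function names and the in-memory record format are hypothetical; the original program read and wrote files.

```python
def predict(weights, c, heuristics):
    """Evaluate the scoring polynomial w0*h0 + ... + wk*hk + c."""
    return sum(w * h for w, h in zip(weights, heuristics)) + c

def uncertainty_records(rows, targets, weights, c):
    """Pair each position's squared residual with its heuristic values.

    rows[j] holds the heuristic scores of position j, targets[j] its
    ideal 1/N score. The result is the data set for a second
    regression of the squared residual on the same heuristics.
    """
    records = []
    for heuristics, target in zip(rows, targets):
        residual = target - predict(weights, c, heuristics)
        records.append((residual ** 2, list(heuristics)))
    return records
```

Regressing the first element of each record on the second would then give the coefficients of the uncertainty polynomial.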


4.0 RESULTS


The  results  of  the  experiment  were  very  fruitful  in 
providing  insights  into  the  interrelation  and  effectiveness 
of  the  heuristics  in  CHUTE.  In  many  cases,  the  results  of 
the  experiments  confirmed  problem  areas  that  had  been 
previously identified on an intuitive basis [Vranesic75].
Most  of  the  results  relate  to  heuristic  scoring  efficiency. 
We  also  gained  considerable  insight  into  the  potential 
problems  of  implementing  the  CTW  search  strategy  within  the 
framework  of  CHUTE. 


In this chapter, we ... experiments described in ...
first ... at what we learned in ... Next, we present the
results ... we give an interpretation ... the objectives
of the research.

4.1 Data Gathering

One of these problems emerged when we tried to convert
the  heuristic  scoring  from  move  to  position  scoring.  Recall 
that  the  derived  score  for  a  particular  heuristic  at  any 
given  position  depends  on  the  cumulative  effect  of  that 
heuristic's original scores for all previous moves by both
sides.  Certain  heuristic  measures  were  found  not  to  be 
applicable  to  position  scoring. 

As  an  example  of  this  problem,  let  us  examine  the 
derived score of heuristic 20 on a sequence of 3 moves. Let
these moves be knight B1 to C3, same knight C3 to D1, and a
pawn move B2 to B3. If we make the moves in the sequence of
knight move, knight move, then pawn move, heuristic 20 will
result in some negative score due to the repetition of
knight moves. However, if we take the sequence of a knight
move, the pawn move, and then the second knight move,
heuristic 20 will not comment on the moves at all; hence the
result for the final position for this sequence will be 0.
Clearly, both sequences of moves result in the same
position; however, heuristic 20 will reflect different
derived  scores.  The  same  kind  of  analysis  can  be  applied  to 
heuristic  9  and  to  a  lesser  degree  to  heuristic  21. 
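
The order dependence can be reproduced with a toy model of heuristic 20. The model below is an assumption made for illustration: it penalizes a move only when the same piece also made that side's previous move, and the penalty value of -1 is arbitrary; CHUTE's real heuristic may differ.

```python
def heuristic20_total(sequence, penalty=-1):
    """Sum a toy 'successive moves with the same piece' penalty.

    `sequence` lists the pieces moved by one side, in order.
    """
    total = 0
    previous_piece = None
    for piece in sequence:
        if piece == previous_piece:
            total += penalty        # same piece moved twice in a row
        previous_piece = piece
    return total

# Both orders reach the same position but accumulate different scores.
knights_first = heuristic20_total(["knight", "knight", "pawn"])
pawn_between = heuristic20_total(["knight", "pawn", "knight"])
```

Here `knights_first` is -1 while `pawn_between` is 0, so a cumulative score built from this heuristic depends on the path to a position rather than on the position itself.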


The  problem  of  heuristic  incompleteness  also  had  a 
direct  effect  on  move  to  position  scoring  conversion.  In 
positional  scoring,  a  heuristic  score  is  supposed  to  reflect 
the "status" of one aspect of the position. We derived such
scores as the cumulative effect of move heuristics. Thus if
a  move  heuristic  was  incomplete,  then  the  resulting 
cumulative  score  was  certainly  incomplete. 

We  found  certain  heuristics  in  CHUTE  to  be  inherently 
incomplete by design. Completeness is a vital issue in both
positional  and  move  scoring. 

As  an  illustration  of  incompleteness,  consider  heuristic 
18.  Here  we  have  a  partial  scoring  of  the  pawn  structure 
aspect.  This  heuristic  awards  points  if  a  pawn  structure  is 
created,  but  there  is  no  scoring  done  if  a  pawn  structure  is 
destroyed  by  the  mover  or  the  opponent.  So  if  heuristic  18 
is  used  as  the  adjuster  for  the  score  of  the  pawn  structure 
aspect, we will have an inaccurate score. A similar
situation occurs with heuristics 0 (passed pawn attack) and
1 (attack an attacker). The commonality of this problem seems
to be the heuristic's inability to detect the undoing of a
situation that was previously scored. If we have the
situation that in ply 10, white's knight attacks a passed
pawn but subsequently moves elsewhere without capturing the
pawn, then the cumulative score of heuristic 0 will only
reflect the original threat but not the removal of the threat.

The following is a list of the heuristics that we found
to be incomplete by design:

Heuristic 0 - detects attack on passed pawns, doesn't
    detect the removal of the attack.

Heuristic 1 - detects an attack on an attacker, doesn't
    detect the removal of the attack on the attacker.

Heuristic 6 - detects attack on an opponent, doesn't detect
    the removal of the attack on an opponent piece.

Heuristic 11 - detects rook on own passed pawn file,
    doesn't detect the removal of a rook from own passed
    pawn file.

Heuristic 10 - detects moving a rook into an open file,
    doesn't detect moving a rook from an open file.

Heuristic 15 - detects doubling pieces, doesn't detect
    un-doubling of pieces.

Heuristic 16 - detects attack on enemy king squares, doesn't
    detect removal of such an attack threat.

Heuristic 18 - scores the creation of pawn structures,
    doesn't score the destruction of pawn structures.

The following heuristics are partial measures of the
same aspects. Individually, they are incomplete:

Heuristics 2, 3, 4, (5) - lines of attack and defence on own
    pieces

Heuristics 12, 14 - material advantage measures

Heuristics 17, 19 - pins

4.2 Data Analysis

We  now  present  the  results  of  the  data  analysis.  We 
look  at  the  ranking  results  first,  then  the  results  from  the 
simple  statistical  analysis,  and  finally,  the  results  of  the 
regression  analysis. 

4.2.1 Ranking

The  detailed  results  of  these  experiments  can  be  found 
in  Appendix  3.  A  summary  can  be  found  in  graph  form  in  FIG 
4.1.  The  results  on  the  graphs  can  be  interpreted  as  the 
probability  of  the  occurrence  of  Type  1  errors  as  a  function 
of  fanout. 

The  first  graph  represents  the  results  of  the  ranking 
measure  over  the  whole  data  base.  We  can  see  the  effects  of 
data  selection  on  the  basis  of  ply.  The  result  that  error 
rate  decreases  in  the  endgame  is  of  little  consolation  since 
by  that  time  the  ideal  continuation  will  very  likely  be  a 
draw or worse. As we have seen in Chapter 2, the effect of
type 1a errors can be devastating to good play, and these
results strongly suggest that deficiencies in the static
evaluator would prevent master play, even if there were no
problems  with  the  search  strategy. 

The  second  graph  represents  measures  taken  over  the  test 
file  of  3  games.  The  effectiveness  of  CHUTE  over  this  file 
was different than over the entire data base. But now the
focus  is  on  comparing  the  relative  change  in  the  ranking 
while  varying  the  scoring  polynomial.  The  intention  here  was 
not  to  provide  an  absolutely  scaled  evaluation  of  the 
polynomial.  The  unaltered  polynomial  can  be  seen  to  have  an 
effectiveness  somewhat  higher  than  before. 

The  ranking  experiment  that  produced  very  interesting 
results  was  the  one  in  which  we  left  out  certain  heuristics 
from  the  scoring  polynomial.  The  ones  excluded  were  those 
that  were  negatively  correlated  to  the  CTW  score.  The 
included  heuristics  were  3,  4,  6,  7,  9,  11,  12,  13,  14,  15, 
16,  and  19.  It  would  appear  from  the  result  that  the 
excluded  heuristics  did  not  contribute  to  the  reliability  of 
scoring.  (As  a  matter  of  fact,  our  measure  over  a  fanout  of 
3  increased  by  115.)  This  is  a  dramatic  statement  about  the 
ten  excluded  heuristics,  but  should  be  verified  by  an 
experiment with more games.

We also measured the effects on type 1 errors of
changing the heuristic weightings. The first set of
weightings were those that were derived as a result of
regression analysis over the whole file. The ranking of ...

4.2.2 Scoring Statistics

The results of calculating ... are summarized in Table
4.2.1. In Table 4.2.2, we see the results of these measures
over that part of the data base where the CHUTE choice move
and the master move were different. The scores used in
these tests were the unconverted heuristic scores for each
move.

We first examine the results in Table 4.2.1. These
results provide a comparison between the way each heuristic
reacted to the master move and to the move selected by the
polynomial. The significance of these results is that of
the 22 heuristics, only one (heuristic 21) consistently
favoured the master move. This has a severe effect on type
2 errors, where choosing the master move as the best is
crucial. For the purposes of ranking the ideal move among
the candidate set, the best indicator of heuristic
effectiveness is the difference between the master move mean
and CHUTE's choice mean.

Another  feature  of  the  results  is  that  the  material 
measure  (i.e.,  heuristics  12  and  14)  is  shown  to  be  the 
dominant  factor  in  the  scoring  errors.  Any  efforts  at 
improvement  should  initially  concentrate  on  correcting  these 
heuristics, since their combined "penalty" of 101 points far
outweighs  the  total  contributions  of  the  other  heuristics. 

The  standard  deviation  of  the  difference  gives  a  clue  to 
the  "randomness"  of  the  error  in  scoring.  A  high  standard 
deviation  of  the  difference  indicates  that  the  difference 
between  the  master  move  score  and  the  CHUTE  move  score 
fluctuated  "wildly"  about  the  mean  of  the  difference.  This 
is  to  be  contrasted  with  a  low  standard  deviation,  where  the 
difference  was  consistently  about  the  mean. 

The normalized mean difference (Norm Diff = Mean Diff /
Std. Dev. Diff) is a measure of how damaging a heuristic is,
in  terms  of  ranking  moves.  The  relatively  most  damaging 
heuristics,  according  to  this  measure,  were  heuristics  6, 
12,  13,  14,  7,  and  16.  The  least  damaging  were  21,  4,  11, 
19,  3,  and  18. 
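
The normalized mean difference can be sketched in a few lines. The
following is only an illustration, assuming that the difference is
taken as CHUTE's score minus the master move's score (which agrees
with the signs in Table 4.2.1) and that a sample standard deviation
is used; the function names are hypothetical and are not part of
CHUTE.

```python
from statistics import mean, stdev


def normalized_mean_difference(chute_scores, master_scores):
    """Return (mean diff, std dev of diff, norm diff) for one heuristic.

    Both lists hold raw heuristic scores for the same positions, in the
    same order: the score of CHUTE's chosen move and of the master move.
    Assumption: difference = CHUTE score - master score.
    """
    diffs = [c - m for c, m in zip(chute_scores, master_scores)]
    mean_diff = mean(diffs)
    std_diff = stdev(diffs)  # sample standard deviation (an assumption)
    norm_diff = mean_diff / std_diff if std_diff else 0.0
    return mean_diff, std_diff, norm_diff


def norm_from_summary(mean_diff, std_diff):
    """Norm Diff computed directly from tabulated summary values."""
    return mean_diff / std_diff


# Summary values for heuristics 6, 12 and 21 as read from Table 4.2.1.
for heuristic, (md, sd) in {6: (11.00, 41.0), 12: (59.00, 215.8), 21: (-0.03, 1.3)}.items():
    print(heuristic, round(norm_from_summary(md, sd), 2))
```

A large positive value means the heuristic scored CHUTE's choice
above the master move consistently, not just on average.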

We  can  see  in  Tables  4.2.1  and  4.2.2  that  the  standard 
deviations  of  the  differences  are  consistently  much  larger 
than the mean differences. This suggests that the
heuristics  are  deficient  in  identifying  the  master  move. 
Thus  we  have  corroboration  of  our  earlier  conclusions: 
errors  in  scoring  are  due  in  large  part  to  intrinsic 
problems  in  the  heuristics,  rather  than  to  improper  choices 
of  weightings  in  the  scoring  polynomial. 

In comparing the results in Tables 4.2.1 and 4.2.2,
there are a few interesting differences in the means of the
CHUTE scores and the master scores. For instance, for
heuristic 12, the means for the cases where CHUTE and the
master disagree are lower than the means for all the cases.
This would suggest that the more prominent the material
measure, the better the chance that CHUTE and the master
agree. The converse seems to be true for heuristic 6, where
the CHUTE mean increases.

The results of the correlation to CTW measure are
presented in Table 4.3 under the R1.2 column.

The highest positive correlations can be seen in
heuristics 7 and 6 (the development and the attack
heuristics) at .47 and .26 respectively. The greatest
negative correlations are heuristic 20 (successive moves),
2 (move from an attack), 1 (capture attacker), and 0 (attack a
passed pawn).
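
As an illustration of the correlation measure, the sketch below
computes an ordinary Pearson correlation coefficient between a
heuristic's scores and the CTW scores for the same positions. We
assume here that the reported correlations are of this standard
form; the function name is hypothetical.

```python
from math import sqrt


def pearson_r(heuristic_scores, ctw_scores):
    """Pearson correlation between one heuristic's scores and the CTW scores."""
    n = len(heuristic_scores)
    if n != len(ctw_scores) or n < 2:
        raise ValueError("need two equally long lists with at least 2 items")
    mean_h = sum(heuristic_scores) / n
    mean_c = sum(ctw_scores) / n
    # Sums of cross-products and squares about the means.
    s_hc = sum((h - mean_h) * (c - mean_c)
               for h, c in zip(heuristic_scores, ctw_scores))
    s_hh = sum((h - mean_h) ** 2 for h in heuristic_scores)
    s_cc = sum((c - mean_c) ** 2 for c in ctw_scores)
    if s_hh == 0 or s_cc == 0:
        return 0.0  # a constant series carries no information
    return s_hc / sqrt(s_hh * s_cc)
```

A value near +1 means the heuristic rises with the CTW score, a
value near -1 means it falls as the CTW score rises, and a value
near 0 means the two are linearly unrelated.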

          --- CHUTE ----    --- MASTER ---
HEUR       MEAN     STD.     MEAN     STD.     MEAN     STD.     NORM
NO.                 DEV.              DEV.    DIFF.     DEV.    DIFF.
                                                       DIFF.

0           [?]      9.0      [?]      8.0     0.72      9.4      .08
1          6.50     16.0     4.80     14.8     1.70     13.0      .13
2         12.00     24.0    11.10     23.5     0.90     17.7      .05
3          4.00     27.0     3.10     22.9     0.90     21.7      .04
4           .80      4.0      .50      3.2     0.30      4.3      .01
5          7.00     16.0     4.93     14.8     2.07     14.6      .14
6         25.00     40.0    14.00     37.2    11.00     41.0      .27
7         12.00     22.0     7.20     22.9     4.80     24.8      .19
8          1.90     10.0      .07     11.0     1.83     12.5      .15
9          6.30     12.0     4.60     11.0     1.70     10.8      .16
10         1.50      7.0     1.10      6.0     0.40      7.6      .05
11         1.40      2.0      .09      1.9     0.05      2.6      .02
12       168.00    364.0   109.00    339.0    59.00    215.8      .27
13        22.00     69.0    10.00     48.0    12.00     57.8      .21
14       -41.00    164.0   -83.00    245.0    42.00    212.0      .20
15         3.50     18.0     0.10     15.0     3.40     19.5      .17
16         8.40     34.0     2.10     22.0     6.30     33.3      .19
17         4.00     29.0     1.20     30.0     2.80     32.0      .09
18         0.50     23.0    -0.50     18.4     1.00     22.9      .04
19         0.60     12.0     0.28      7.5     0.32     12.3      .03
20        -0.90      7.0    -1.70      9.8     0.80      8.4      .10
21         0.07      1.1     0.10      1.3    -0.03      1.3     -.02
Poly     246.55    373.9    92.38    429.6   154.17    270.3      .57

Table 4.2.1  Values for raw heuristic scores over
4295 data points. Entries marked [?] are illegible
in the source.

          --- CHUTE ----    --- MASTER ---
HEUR       MEAN     STD.     MEAN     STD.     MEAN     STD.     NORM
NO.                 DEV.              DEV.    DIFF.     DEV.    DIFF.
                                                       DIFF.

0           [?]     10.0      .94      7.2     0.91     11.3      .08
1          4.66     14.1     2.24      9.7     2.42     15.5      .15
2          9.97     22.4     8.36     20.7     1.61     21.2      .07
3          4.26     24.0     1.88     14.4     2.38     26.0      .09
4          0.96      4.6     0.50      3.3     0.46      5.1      .08
5          6.78     15.7     2.96     12.2     3.82     17.4      .21
6         27.34     38.8    10.63     33.2    16.70     48.3      .34
7         12.60     22.0     5.60     21.8     6.99     29.5      .23
8          2.08     10.6     0.34     11.9     1.74     15.0      .11
9          5.00     11.5     2.51      8.2     2.49     12.9      .19
10         1.50      7.6     0.93      6.1     0.56      9.1      .06
11         0.16      2.5     0.09      1.9     0.06      3.1      .01
12       109.82    273.3    24.81    192.2    85.01    254.0      .33
13        20.45     66.1     4.02     27.0    16.42     68.7      .23
14       -32.02    141.2   -92.30    259.8    60.28    252.1      .23
15         4.56     19.3    -0.38     15.3     4.95     23.2      .21
16         9.65     37.0     0.56     17.9     9.09     39.6      .22
17         4.65     28.5     0.56     29.5     4.08     38.3      .10
18        -0.12     23.0    -0.16     16.0     0.04     27.5      .00
19         0.86     13.8     0.32      7.4     0.53     14.8      .03
20         0.65      6.0    -1.83      9.9     1.18     10.1      .11
21         0.05      0.9     0.09      1.4    -0.03      1.5     -.01
Poly     194.47    269.3   -27.27    314.3   221.75    300.2      .74

Table 4.2.2  Values for raw heuristic scores over
2986 data points. Entries marked [?] are illegible
in the source.

4.2.3  Heuristic Weightings

As detailed in the second chapter, the heuristics scored
the master game file and the resultant scores were fitted to
the ideal CTW score by means of regression analysis. The
outcome of this analysis can be seen in summary form in
Table 4.3 and in detail in Appendix 3. For each test run,
we have a set of derived weightings that correspond to the
heuristics included in that run. The rows marked "ADV" and
"MTR" denote measures other than the original 22 heuristics.
At the bottom of each column, we have the percentage of
variance, r2, for the particular independent variable set.
The column marked "COR" indicates the correlations
calculated for each heuristic to the CTW score.

RUN NUMBER (See Appendix 3)

         COR r    ----------------- WEIGHTINGS ------------------
HEUR      R1.2    R1.3    R1.5    R1.6    R1.7    R2.1    R2.2    R2.3

0         -.23   -12.0   -16.0       -       -   -18.0       -    -8.0
1         -.20   -17.0    -8.0       -       -   -14.0   -20.0   -28.0
2         -.24    -5.0    -5.0       -       -    -7.0     2.0    -3.0
3          .07     2.0     2.0     3.0       -     3.0    20.0     2.0
4          .03    39.0    39.0    22.0       -    27.0   -24.0    67.0
5         -.09    -5.0    -4.0       -       -    -5.0   -16.0    -1.0
6          .26     6.0     5.0     8.0       -     3.0     0.1     9.0
7          .47    20.0    15.0    23.0       -    13.0     6.0    23.0
8          .02    -7.0    -7.0       -       -    -3.0     6.0   -24.0
9         -.14    -8.0    -7.0       -       -    -8.0    23.0    -4.0
10         .12    13.0    18.0     4.0       -     4.0       -    18.0
11         .14    30.0    46.0    57.0       -    14.0       -    51.0
12         .07     1.0     0.1     1.0       -     1.0     1.0     1.0
13         .24     2.0     1.0     4.0       -     2.0     0.1     2.0
14         .07     0.1    -0.1     0.1       -     0.1    -1.0    -0.1
15         .05     1.0     3.0     3.0       -    -2.0       -     6.0
16         .20     8.0     6.0     9.0       -    23.0    32.0     3.0
17        -.16   -10.0    -9.0       -       -   -10.0    -7.0   -12.0
18        -.01    -5.0    -2.0       -       -    -4.0    -1.0    -5.0
19         .04    -2.0    -0.1       -       -    -1.0    -4.0    -6.0
20        -.31   -15.0   -10.0       -       -    13.0       -   -19.0
21        -.02    37.0    59.0       -       -  -106.0       -   -27.0
ADV02      .44       -  1945.0       -  3496.0       -       -       -
ADV03      .11       -  1047.0       -  2316.0       -       -       -
ADV04     -.14       -    60.0       -  1093.0       -       -       -
ADV05      .14       -  1151.0       -  2408.0       -       -       -
MTR        .24       -       -       -  -145.0       -       -       -

r2                44.7    50.9    35.6    25.7    32.9    31.8      56
PLIES      all     all     all     all     all   16-60     <16     >60

Table 4.3

Note: The scores for the heuristics should be multiplied
by .0001. MTR is the score for the derived
variable MATRL. r2 is the % of reliability of the
derived polynomial. COR is the correlation of the
heuristic to the CTW score.

An important result that is not reflected in Table 4.3
is the ordering of the heuristics in the regression. The
ordering  of  the  heuristics  in  the  analysis  reflects  how 
well  the  heuristic  scores  account  for  the  remaining 
variance  of  the  dependent  variable,  namely  the  CTW  score. 
This  ordering  can  be  seen  on  the  actual  results  in  Appendix 
3. 

An  overall  feature  of  the  regression  analysis  is  that 
some  heuristics  have  been  assigned  a  negative  coefficient 
for their part in the predictive polynomial. This is due in
large part to the mathematics of regression, namely that the
scores which correlate negatively with the dependent
variable will usually end up with negative coefficients in
the regression. A tempting conclusion that could be drawn
from this is that the heuristics for which we derive the
negative coefficients were reversed in their scoring. (We
assume that a positive score is good and a negative score is
bad.) Another possibility is that the mathematical model
used was ill conditioned. However, the same results were
obtained by using two different statistical program
packages (SPSS and BMD), whose handling of the regression
problem  is  different. 

The order of the heuristics (in R1.3) suggests that
development, successive moves, attack and material balance
were the most important considerations in predicting the CTW
score. (However, the inclusion of successive moves (heuristic
20) is suspect due to the non-cumulativeness of this
particular heuristic.)

Note  the  low  predictive  power  of  the  polynomial 
weightings derived in R1.3. The score of 44.7% for r2 is the
percentage of the variance that the polynomial accounts for.
This  figure  represents  the  expected  reliability  of  the 
polynomial.  In  a  statistical  sense,  this  low  a  figure  means 
that  the  independent  variables  included  in  the  analysis  were 
not  a  large  enough  subset  of  all  the  factors  that  determine 
the  dependent  variable.  In  other  words,  the  heuristic  scores 
don't  provide  enough  information  for  the  prediction  of  the 
CTW  score.  This  could  point  to  the  need  for  more  heuristics 
or  for  more  accurate  measures  from  the  present  heuristics. 
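
To make the meaning of the r2 figure concrete, the sketch below
computes the percentage of variance accounted for from a list of
ideal scores and the corresponding polynomial predictions. It is an
illustration only: the function name is hypothetical, and we assume
the usual definition of r2 as one minus the ratio of residual to
total sum of squares.

```python
def percent_variance_explained(actual, predicted):
    """Percentage of the variance of `actual` accounted for by `predicted`.

    Assumes r2 = 1 - SS_residual / SS_total, the usual definition for a
    least-squares fit that includes a constant term.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    if ss_total == 0:
        raise ValueError("actual scores have no variance")
    return 100.0 * (1.0 - ss_residual / ss_total)
```

Under this definition, a figure of 44.7 means that more than half
of the variation in the CTW score is left unexplained by the
heuristic scores included in the run.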

The  latter  alternative  is  supported  by  the  obviously  low 
correlation of some heuristics to the CTW. For instance, one
would feel that the material balance heuristics, heuristics
12 and 14, should have a fairly high correlation to winning,
rather than the actual results of .07 and .07. This
intuitive notion, along with some supporting evidence, will
be discussed later. Also it is possible that the
relationship  between  heuristics  and  the  CTW  score  was  of  a 
non-linear  nature.  This  possibility  will  be  explored  in 
section  4.4. 

We now turn to the variations where source data
selection and heuristic inclusion are varied. The first of
these variations (R1.5) was an attempt to increase the
reliability of the predictive polynomial by adding the
calculated material "heuristics" ADV02 to ADV05. This
variation gave rise to an increase from 45% to 51% in r2.
The ordering of the heuristics remained much the same as
before.

The high correlation of the calculated material balance
"heuristics" confirms the intuitive notion of the importance
of  material.  A  most  surprising  result  was  the  very  high 
correlation  of  the  pawn  advantage  to  the  CTW  score.  The 
results  suggest  that  a  partially  complete  order  of 
importance  of  material  considerations  is  pawns,  bishops,  and 
rooks.  This  is  quite  contrary  to  the  intuitive  measures 
used  to  date  in  assessing  piece  importance,  and  may  be  a 
consequence  of  the  relative  scarcity  of  master  positions  in 
which  either  side  is  at  a  disadvantage  of  a  piece  or  more. 

Another variation of the experiment (R1.6) was the
selection  of  only  those  heuristics  which  had  a  positive 
correlation  to  the  CTW  score  for  inclusion  in  the  regression 
analysis.  The  r2  of  35.6%  confirms  that  these  heuristics 
alone  are  not  enough  for  a  viable  polynomial. 

In R1.7, we tried to assess the viability of using just
the simple, calculated material "heuristics". The result of
25% confirms the importance of these measures, especially
when this figure is compared to the 51% that is the best
result to date for the entire complement of heuristics.

Finally, we have in R2.1, R2.2 and R2.3 the results of
the data selection variation. It is apparent that the
scoring polynomial is most efficient in the endgame. The
reliability result of 56% in R2.3 is the best for all
variations.

Recall that the weightings assigned to heuristics can be
regarded as a measure of their effectiveness. The results
of R2.1, R2.2 and R2.3 show that many heuristics fluctuate
quite dramatically in their effectiveness throughout the
phases  of  a  game.  Only  one,  heuristic  12,  is  consistent 
over  all  phases. 

Some heuristics are rated highest in the endgame:
4, 6, 7, 10, 11 and 15. (The importance of 7, the development
heuristic, in the endgame is a very interesting result.) In
the  opening,  we  have  2,  3,  8,  9,  and  16  at  their  best.  In 
the  midgame,  only  13,  14,  and  20  are  at  their  peak. 

These  results  point  out  that  many  heuristics  are  most 
effective at the "wrong" times. For instance, heuristic 20
(discourage moves with same piece) was designed specifically
for the opening, yet it is most prominent in the midgame.
Heuristic  6  (evaluate  an  attack),  on  the  other  hand,  should 
be  most  prominent  in  the  midgame,  yet  it  is  strongest  in  the 
endgame.  This  suggests  that  there  may  be  a  need  to  control 
when  a  heuristic  is  used,  depending  on  game  phase. 
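
One way such phase control could look is sketched below. This is
not CHUTE's scoring code: the names are hypothetical, only
heuristics 6, 7 and 12 are included, and the weights are the R2.2,
R2.1 and R2.3 weightings from Table 4.3 with the ply ranges of those
runs (<16, 16-60, >60) taken as the opening, midgame and endgame.

```python
SCALE = 0.0001  # Table 4.3 note: weightings are to be multiplied by .0001

# Weightings for a few heuristics, keyed by heuristic number.
PHASE_WEIGHTS = {
    "opening": {6: 0.1, 7: 6.0, 12: 1.0},   # run R2.2, plies < 16
    "midgame": {6: 3.0, 7: 13.0, 12: 1.0},  # run R2.1, plies 16-60
    "endgame": {6: 9.0, 7: 23.0, 12: 1.0},  # run R2.3, plies > 60
}


def game_phase(ply):
    """Map a ply count to a game phase using the ply ranges of the runs."""
    if ply < 16:
        return "opening"
    if ply <= 60:
        return "midgame"
    return "endgame"


def phase_score(ply, heuristic_scores):
    """Weighted sum of raw heuristic scores using phase-specific weights.

    `heuristic_scores` maps heuristic number -> raw score for one move.
    Heuristics without a weight in the current phase are left out, which
    is how a heuristic would be switched off for that phase.
    """
    weights = PHASE_WEIGHTS[game_phase(ply)]
    total = sum(weights[h] * s for h, s in heuristic_scores.items() if h in weights)
    return SCALE * total
```

For example, `phase_score(70, {7: 10, 12: 100})` weights the
development score by 23.0 and the material score by 1.0, since ply
70 falls in the endgame range.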

4.2.4  Uncertainty 

The  coefficients  derived  from  R1.3  in  the  regression 
experiments  were  now  used  to  derive  an  uncertainty 
polynomial. The result of the regression (seen as R3.1 in
Appendix  3)  shows  that  the  difference  between  the  predictor 
polynomial  and  the  ideal  score  was  only  fractionally 
accounted  for  by  the  variations  in  the  heuristics.  The 
variance  of  the  r2  error  measure  was  predictable  with  only 
.39%  reliability.  The  heuristics  which  most  contributed  to 
this uncertainty were 7, 12, 19, 14, 16, 21 and 01. Of course,
with such a low reliability, this ordering is hardly
conclusive.

These  results  strongly  suggest  that  specialized 
heuristics  whose  sole  function  is  the  prediction  of 
uncertainty  would  have  to  be  developed  to  make  the  CTW 
strategy  viable.  Also  we  must  remember  that  the  weightings 
used  were  not  reliable  past  51%,  hence  we  may  expect  better 
results  in  uncertainty  prediction  once  a  better  scoring 
polynomial  is  developed.  However,  it  is  fairly  evident  that 
errors for a heuristic score are not predictable from that
score.  Although  the  aspect  that  each  heuristic  measures  will 
probably  have  a  definite  uncertainty  associated  with  it,  it 
is  not  evident  in  the  present  heuristics. 

4.3  Interpretation of Results

The  results  have  shown  that  the  methods  used  in 
polynomial  and  heuristic  evaluation  were  effective  tools  in 
confirming  and  isolating  problems.  In  general,  this  means 
that  such  measures  can  be  valuable  tools  in  the  process  of 
writing  and  improving  chess  programs.  In  the  specific  case, 
the  results  have  served  the  purpose  of  evaluating  CHUTE  and 
shedding  light  on  the  feasibility  of  the  CTW  strategy. 

4.3.1  General Evaluation

The  most  significant  result  was  the  discovery  that  we 
could  exclude  certain  heuristics  from  the  scoring  polynomial 
without  lowering  the  polynomial's  ranking  effectiveness. 
This  suggests  deep  problems  in  the  reliability  of  the 
heuristics  that  were  excluded.  More  importantly,  however,  it 
was  the  regression  analysis  that  predicted  which  heuristics 
could  be  excluded.  This  is  a  confirmation  of  the  accuracy  of 
the correlation to CTW score as a measure of individual
heuristic effectiveness.

The rankings derived over the whole file allow us to
calculate some reasonable estimates of the probability of
type 1a and 2a errors in CHUTE. The probability of type 1a
error for a fanout of 7 (the value used in the 1974 ACM
tournament) is A(7) = .25. Similarly, the result for a
fanout of 1 is A(1) = .69. Since A(1) is the same as the
probability of type 2a errors, we have a measure of the
error at the terminal node levels.
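
The estimate for a given fanout can be sketched as follows. We
assume here that a type 1a error at fanout n means the master move
was ranked below the first n candidates by the scoring polynomial,
so that the estimate is simply the fraction of positions where this
happened; the function name and the 1-based rank convention are
hypothetical.

```python
def exclusion_rate(master_ranks, fanout):
    """Fraction of positions in which the master move falls outside the fanout.

    `master_ranks` holds, for each position, the 1-based rank that the
    scoring polynomial gave to the master move among all legal moves.
    Assumption: the move is excluded from search when its rank > fanout.
    """
    if not master_ranks:
        raise ValueError("no positions supplied")
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    missed = sum(1 for rank in master_ranks if rank > fanout)
    return missed / len(master_ranks)


# Hypothetical ranks for four positions, for illustration only.
ranks = [1, 3, 8, 2]
print(exclusion_rate(ranks, 1), exclusion_rate(ranks, 7))
```

Evaluating the same list of ranks at several fanouts shows how
quickly, or slowly, the error falls as the fanout is widened.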

The  problem  of  heuristic  incompleteness  had  a  definite 
effect  on  the  analysis  and  the  ranking  experiments.  One  of 
the  underlying  premises  of  the  scoring  is  that  if  the 
measure  of  an  aspect  of  a  position  is  complete,  then  this 
measure  should  positively  correlate  with  the  CTW  score.  In 
other words, we are assuming that the winner of a game will
usually  be  better  off  in  the  majority  of  aspects  of  the 
position  and  certainly,  over  a  large  number  of  games,  this 
will  be  true  of  all  aspects.  If  an  aspect  does  not  have  this 
property,  then  the  aspect  is  either  irrelevant  to  the 
winning  of  the  game  or  is  scored  incorrectly.  Note  that 
the  concepts  'complete'  and  'effective'  are  not  identical:  a 
heuristic  may  be  incomplete  but  effective.  Completeness  is 
one  factor  in  determining  the  effectiveness  of  a  heuristic. 

We  found  that  in  some  cases,  there  was  contradictory 
evidence  for  the  effectiveness  of  a  heuristic.  For 
instance, its design may have been incomplete whereas it
correlated positively to CTW. This led us to conclude that
there  may  be  "degrees"  of  effectiveness. 

These degrees of effectiveness may be due to the
non-symmetry of aspects. For instance, examine an aspect such
as "attack a piece" (heuristic 6). We want to measure both
the creation and the destruction of a line of attack in
order to have a complete assessment of this aspect.
However, it seems unreasonable to put the same weight on
destruction as on creation, since creation will usually have
a greater and more lasting impact. An attack on a piece has
"residual" effects on the subsequent positions that persist
after  the  attack  is  removed.  Therefore,  if  a  heuristic  only 
measures  creation  of  an  attack,  then  it  may  still  reflect  a 
large  portion  of  the  aspect.  Such  a  heuristic  may  be  called 
reasonably  effective. 

4.3.2  Specific  Evaluation  of  Heuristics 

The results allow us to evaluate CHUTE's performance in
terms of two of its components: its heuristics and its
scoring polynomial. We now examine each heuristic in
detail.

Heuristic 0 (attack a passed pawn) - This heuristic was
among those that were inherently incomplete by design.
It also correlated negatively to the CTW score, further
indicating ineffectiveness. Regression consistently
assigned a negative weighting to it. The heuristic's
overall ineffectiveness is probably attributable to its
incomplete design.

Heuristic  1  (attack  an  attacking  piece)  -  This  heuristic  was 
also  incomplete  by  design  and  the  other  measures 
supported  this  conclusion. 

Heuristic  2  (leave  an  attack)  -  This  heuristic  was 
incomplete  by  design.  It  correlated  negatively  to  the 
CTW;  however,  it  did  get  a  positive  weighting  from 
regression  in  the  opening  game  phase. 

Heuristic  3  (expose  to  attack  or  defence)  -  Heuristics  2,  3, 
and  4  all  measure  the  same  aspect:  lines  of  attack  and 
defence.  Therefore,  we  ranked  heuristic  3  as  incomplete 
by  design.  However,  it  did  correlate  positively  with 
CTW.  It  consistently  received  a  positive  weighting  from 
regression,  the  highest  weighting  being  in  the  opening 
game  phase.  It  was  one  of  the  least  damaging  heuristics 
according  to  the  mean  difference  measure.  This 
heuristic  seems  fairly  effective. 

Heuristic  4  (interpose  an  attack)  -  This  heuristic,  like  2 
and  3,  was  deemed  incomplete  by  design.  Its  correlation 
to CTW was positive, but very low (.03). It seems most
effective in the end game. In the opening phase, it was
assigned a negative weighting by regression, indicating
that  the  heuristic  is  inconsistent  over  game  phases. 

Heuristic  5  (add/delete  defenders)  -  This  heuristic  seems 
complete  on  its  own,  though  it  does  overlap  with 
heuristics 2, 3, and 4. Its negative correlation to CTW
and  its  negative  weighting  in  all  regression  runs 
suggest  that  it  is  ineffective.  This  ineffectiveness 
could  be  due  to  two  things:  its  overlap  to  other 
heuristics  or  problems  in  score  calculations. 

Heuristic  6  (attack  an  opponent  piece)  -  This  heuristic  was 
found  incomplete  by  design,  since  it  does  not  detect 
removal  of  an  attack.  However,  its  fairly  high  CTW 
correlation  and  its  consistently  positive  weighting 
suggest  it  is  effective.  This  was  one  of  the  most 
damaging  heuristics  according  to  the  mean  difference 
measure.  Overall,  it  is  a  fairly  effective  heuristic, 
but  not  consistent  over  game  phase. 


Heuristic  7  (development)  -  This  heuristic  turned  out  to  be 
the  most  effective  heuristic.  It  has  a  very  high 
correlation  to  CTW  (.47)  indicating  that  it  is  highly 
effective.  The  regression  found  it  most  effective  in 
the  endgame. 


Heuristic 8 (freedom for bishops, knights, etc.) - Although
this heuristic is complete by design, it shows a very
low CTW correlation and negative weightings in all but
the opening game phase. This suggests that perhaps this
aspect is only relevant in the opening or that heuristic
8 scores the aspect incorrectly.

Heuristic 9 (attack last moved piece) - We cannot interpret
the results for this heuristic since we found it is not
readily convertible to positional scoring.

Heuristic  10  (rook  on  open  file)  -  This  is  another  case 
where  we  found  design  incompleteness.  The  regression 
analysis  showed  that  this  heuristic  is  most  important  in 
the endgame (as would be expected). Overall, it seems
like  a  fairly  effective  heuristic,  yet  it  was  one  of  the 
"excluded  heuristics"  in  the  ranking  variation. 

Heuristic 11 (rook on own passed pawn file) - This heuristic
is similar to heuristic 10. It too is partially
incomplete, yet effective. Relative to the mean
difference measure, it was one of the least damaging.
Again, it is most prominent in the endgame.

Heuristic  12  (value  for  captured  piece)  -  This  heuristic, 
along with heuristic 14, constitutes the material
advantage measure. It correlated positively with CTW
and the regression analysis showed it to be consistent
over game phase. The results of the mean and difference
in mean analysis (Table 4.2) showed the heuristic to be
highly erratic in identifying the master move. In terms
of  the  mean  difference  measure,  it  was  one  of  the  most 
damaging.  The  results  from  adding  alternate  material 
advantage  measures  to  the  regression  analysis  also 
suggest  that  this  heuristic  is  not  very  accurate  in  its 
score  calculations. 


Heuristic  13  (specials:  castling,  en-passant,  etc.)  -  This 
heuristic  seems  complete  and  effective.  It  seems  most 
important  in  the  end  and  mid  games,  according  to  the 
regression  variation  results. 


Heuristic  14  (sacrifice  vs.  save)  -  This  is  the  other  half 
of  the  material  advantage  heuristic  set.  It  has  a  very 
low,  though  positive,  correlation  to  CTW.  Its 
performance  in  the  regression  is  highly  erratic.  It 
seems  that  there  are  problems  in  the  scoring  of  the 
heuristic  which  are  not  totally  attributable  to  its 
incompleteness. 


Heuristic  15  (double  up  pieces)  -  This  heuristic  was  found 
incomplete  by  design.  The  CTW  correlation,  however, 
suggested  that  it  is  effective.  The  heuristic  was  most 
effective  in  the  endgame,  while  in  the  midgame,  its 
scoring  seemed  to  be  erratic. 


Heuristic 16 (attack squares around the king) - Again,
incomplete by design. The heuristic was consistently
positive in the regression runs. It seemed most
effective in the opening and least effective in the
endgame. It is one of the most damaging heuristics in
terms of the mean difference measure. This result
suggests that there are problems in the heuristic's
score calculations.


Heuristic 17 (create/avoid pins) - Heuristics 17 and 19
measure the same aspect: pins. Hence, heuristic 17 is
incomplete by design. The correlation and regression
results were consistently negative, suggesting that the
incompleteness is severe and/or there are problems in
the score calculations.


Heuristic  18  (pawn  considerations)  -  This  heuristic  was 
found  to  be  incomplete  by  design.  The  CTW  correlation 
and  regression  results  confirmed  this  finding. 

Heuristic  19  (block  pins)  -  This  heuristic  is  the  other  half 
of  the  pin  heuristic  pair.  It  correlated  positively  to 
CTW,  though  the  regression  results  were  consistently 
negative.  This  would  suggest  scoring  problems  as  well 
as  incompleteness. 

Heuristics 20 and 21 - Both these heuristics were not readily
convertible to positional scoring, hence the results
cannot be properly interpreted.

Most of the problems with the scoring polynomial can be
directly  attributed  to  heuristic  problems  rather  than 
problems  with  the  weightings.  The  polynomial  effectiveness 
is  measurable  in  the  ranking  results.  We  were  not  able  to 
come  up  with  better  weightings  through  regression,  though 
the  heuristic  exclusion  results  certainly  show  that  there  is 
room  for  improvement. 

The evaluation of the heuristics points out the
importance  of  several  considerations  in  search  strategy. 
First,  we  need  to  consider  the  importance  of  game  phase  in 
the  inclusion  of  heuristics  in  the  polynomial.  The  dramatic 
variation  in  the  effectiveness  of  some  of  the  heuristics 
over  game  phase  can  only  lead  to  inconsistent  play. 

Second,  we  have  the  disparity  that  some  heuristics  are 
better  at  ranking  (measured  by  the  mean  difference)  than  at 
absolute  scoring  of  moves  (measured  by  CTW  correlation). 
Hence  they  vary  in  their  applicability  at  different  levels 
of  the  lookahead  tree.  As  an  example,  heuristic  6  is  good 
at  absolute  measures  (CTW  correlation  of  .26)  yet  it  is  very 
poor  at  ranking. 


Finally, the results of calculating type 1 error
demonstrate the decreasing returns from increasing fanout.
This is a very undesirable characteristic of the evaluation
function, since the quality of play will not improve
proportionately with the effort spent in searching.

4.4  Weaknesses and Limitations of the Method


The  methods  we  have  outlined  have  certain  shortcomings. 
The  most  obvious  of  these  is  our  inability  to  assign  an  all- 
encompassing  score  for  program  effectiveness.  While  we  have 
evaluated  the  components,  we  have  not  established  a  scale  on 
which  chess  programs  could  be  compared. 


The  first  limitation  is  the  consequence  of  the  lookahead 
problem.  While  we  have  shown  how  error  bounds  in  lookahead 
can  be  estimated,  we  have  not  established  a  measure  of 
lookahead  efficiency.  The  methods  do  not  allow  us  to 
determine  what  improvement,  if  any,  is  realized  by 
lookahead. 


[...] scale [...] is [...] not [...] to support this [...]
(N)) would have been better.

[...] we used was not indicative of computer chess play. We have
not  included  a  lot  of  possible  positions  that  chess  masters 
never  encounter  due  to  the  high  quality  of  master  play. 
Thus  the  reliability  of  the  regression  results  may  not  be  as 
good  as  we  would  want.  The  inclusion  of  non-master 
positions  may  produce  better  results. 

There  is  further  the  possibility  that  the  regression 
over master moves is simply a measure of the heuristics'
ability to detect winning positions "after the fact", rather
than their ability to predict winning moves. Thus
regression may only be useful for determining how effective
a  heuristic  is  as  a  terminal  node  evaluator,  not  how 
effective  it  is  as  a  pruning  tool. 

Another  objection  might  be  the  size  of  the  library.  One 
hundred games may be too few or too many. We have no
evidence  either  way. 
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5.0 CONCLUSIONS AND FUTURE WORK

The  experiments  outlined  in  this  thesis  served  both  to 
develop  an  evaluation  methodology  and  to  establish  the 
groundwork  for  a  further  possible  set  of  refinements  to 
CHUTE.  The  results  lead  to  some  specific  recommendations  as 
well  as  directions  for  future  research. 

In  this  chapter,  we  first  present  some  recommendations 
for  improving  the  current  heuristic  set  in  CHUTE.  We  then 
discuss  an  approach  to  the  position  uncertainty  problem, 
followed  by  suggestions  for  the  realization  of  a  chess 
position  library.  Next,  we  outline  an  experiment  that 
pertains  specifically  to  the  problems  of  implementing  the 
CTW  strategy.  And  finally,  we  present  our  general 
conclusions  and  some  directions  for  future  research. 

5.1 Heuristic Improvement

The  first  order  of  business  should  be  to  improve  the 
current  set  of  heuristics.  This  would  entail  making  sure 
that  each  heuristic  was  complete  for  its  aspect.  For 
instance,  heuristic  18  should  score  the  destruction  of  pawn 
structures  as  well  as  their  creation.  Since  some  aspects 
are  scored  by  more  than  one  heuristic,  the  scoring  of 
certain heuristics should be combined. For instance,
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heuristics  17  and  19,  which  score  pins;  heuristics  3  and  4, 
which  score  interposes. 

One potential problem in ensuring that a heuristic is
complete is the detection of secondary effects. The effects
of certain moves reflect on an aspect in a very roundabout
way. As an illustration, consider FIG. 5.1.

[Board diagram for FIG. 5.1 not recoverable from the scan.]

FIG. 5.1 An Example Position

By  moving  WF  to  pin  BF  to  the  BK,  we  are,  as  a  secondary 
effect,  removing  the  threat  to  the  WQ.  At  present,  the  only 
method  of  removing  the  threat  on  WQ  that  is  rated  highly  by 
the  CHUTE  heuristics  is  moving  the  queen.  This  situation 
actually  occurred  in  one  of  the  library  games.  The  master 
move was to pin the BF. CHUTE's choice was to move the
queen. CHUTE did not even consider the master move as one
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The  manner  in  which  heuristics  are  used  also  should  be 
improved.  The  results  have  pointed  out  the  difference  in 
effectiveness  of  many  heuristics  over  game  phase.  Hence, 
there  is  the  need  to  vary  weightings  and  heuristic  inclusion 
in  the  scoring  polynomial  as  a  function  of  ply. 

The question of how heuristics are applied in the game
tree also bears attention. It is clear from the results
that a heuristic's ability to rank moves (used in pruning)
is not necessarily the same as its ability to assign an
absolute score (used in terminal evaluation). Therefore, we
might want to use two polynomials, each consisting of
different heuristics and/or different heuristic weightings.
An important quality that we would want in the pruning
polynomial would be "robustness": the ability to decrease
type 1 error with increased fanout in a near linear fashion.
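
The two-polynomial idea, combined with weights that vary over game phase, might be organized along the following lines. This is only a sketch under stated assumptions: the phase boundaries, the weight values, and the names game_phase, WEIGHTS and score are hypothetical and are not taken from CHUTE.

```python
from typing import Dict, List

# Hypothetical phase boundaries, in plies; CHUTE's actual phases are not assumed.
OPENING_END = 20
MIDDLE_END = 60


def game_phase(ply: int) -> str:
    """Map a ply number onto a coarse game phase."""
    if ply < OPENING_END:
        return "opening"
    if ply < MIDDLE_END:
        return "middle"
    return "end"


# WEIGHTS[role][phase] maps heuristic index -> weight.
# A heuristic left out of a table is excluded from that polynomial.
# All numbers are placeholders chosen only for illustration.
WEIGHTS: Dict[str, Dict[str, Dict[int, float]]] = {
    "pruning": {
        "opening": {7: 7.0, 8: 7.0},
        "middle": {6: 13.0, 12: 15.0},
        "end": {0: 10.0, 11: 15.0},
    },
    "terminal": {
        "opening": {7: 5.0},
        "middle": {12: 15.0, 14: 15.0},
        "end": {0: 10.0},
    },
}


def score(h: List[float], ply: int, role: str) -> float:
    """Evaluate the polynomial for `role` ("pruning" or "terminal") at `ply`."""
    weights = WEIGHTS[role][game_phase(ply)]
    return sum(w * h[i] for i, w in weights.items())
```

Keeping the two roles in separate tables means the pruning weights could be tuned for ranking quality without disturbing the weights used for absolute scores at terminal nodes.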
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5.2 Uncertainty Heuristics

The development of uncertainty heuristics would have to
proceed  along  the  same  lines  as  scoring  heuristic 
development. 

It  is  evident  from  the  results  that  the  actual  scoring 
heuristics  are  simply  inappropriate  for  the  uncertainty 
measure  of  a  position.  We  thus  have  no  evidence  to  support 
the  claim  that  "in  many  situations  it  is  easier  to 
statically  detect  that  there  is  much  uncertainty  than  it  is 
to  statically  calculate  the  effect  of  playing  the  position 
out"  [Horning72].  To  do  so,  we  would  need  to  devise 
heuristics  that  specialize  in  uncertainty  for  each  of  the 
currently  measured  aspects,  as  well  as  other  aspects.  Some 
suggested  areas  that  could  be  looked  at  would  be  the 
presence  and  magnitude  of  potential  exchanges,  and 
complexity  of  the  board.  The  uncertainty  in  the  case  of  one 
or  more  potential  exchanges  increases  with  the  number  of 
pieces involved along with the range of values of the
pieces.

The  complexity  of  the  board  indicates  how  many  options 
the  opposition  has  in  selecting  his  next  move.  A  factor  of 
complexity is the range of damage that the opposition can
do. As an example, a low complexity position would be one
where a move is forced, e.g., in the case of a check on the
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opposition  king.  A  highly  complex  situation  allows  the 
opponent  a  large  number  of  moves,  with  several  of  these 
moves  being  damaging. 


Naturally,  these  measures  would  have  to  be  calculated 
in  a  static  fashion  based  on  the  current  situation  and  the 
proposed  move's  effect  on  it. 
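
A static complexity measure of the kind described could start from something as simple as the following sketch. It assumes that a damage estimate for each legal reply of the opposition is already available from some static estimator; that estimator, the damaging_threshold parameter and all names used here are hypothetical.

```python
from typing import List, NamedTuple


class Complexity(NamedTuple):
    options: int          # number of replies open to the opposition
    damage_range: float   # spread between most and least damaging reply
    damaging: int         # replies whose damage exceeds the threshold


def complexity(reply_damage: List[float],
               damaging_threshold: float = 0.0) -> Complexity:
    """Summarize the opposition's replies from one damage estimate per reply."""
    if not reply_damage:
        return Complexity(0, 0.0, 0)
    return Complexity(
        options=len(reply_damage),
        damage_range=max(reply_damage) - min(reply_damage),
        damaging=sum(1 for d in reply_damage if d > damaging_threshold),
    )
```

A forced reply gives one option and a damage range of zero, which corresponds to the low complexity case; many replies with a wide damage range correspond to the highly complex case.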


5.3 Position Library

One  of  the  features 
utilization  of  a  position  lib 

The  utility  of  a  posi 
confined  to  the  search  strate 
programs  employ  a  limited 
strategy. 

The  basic  data  in  sue 
position  itself,  a  suggested 
score  for  these  moves.  Th 
directing  and  limiting  the  pr 
reasonable  move.  If  the  a 
with  the  current  position  wer 
move,  then  there  would  be 
lookahead  and  scoring  part  of 
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If the position/move data were gathered from master
games, then such a condition would arise; however, if the
library were built up dynamically based on the results of
actual games played, then more care would have to be taken
in using any data derived from the library. The obvious
suggestion for a first attempt at a position library is to
use the master game library as the starting set of
position/move pairs, thereby ensuring a fairly reliable move
determination mechanism.

There are certain problems in using a position library
that arise immediately. First there is the problem of
searching such a library. A tradeoff occurs between the size
of the library and the search time needed to find matches.
Naturally, the larger the library, the better the success
rate in matches, but also the longer the search times needed.
For this reason the search methods and the library
organization have to be fairly sophisticated. In designing
any search technique, we could possibly take advantage of
the non-randomness of master play to direct search.

The bound on the search time is dictated by what
percentage of the time allowed for move selection we are
willing to spend in a search. This percentage is in turn
dependent on the success rate of the search.
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A  question  that  arises  is,  what  occurs  in  the  cases 
where  there  is  no  direct  match  with  any  of  the  library 
positions?  There  will  obviously  be  cases  where  the  match 
fails  due  to  differences  in  the  location  of  pieces  that  have 
no  bearing  on  the  "essence"  of  the  position.  How  does  one 
detect such situations and how can this information be used?
This  problem  of  partial  match  is  complex,  since  criteria 
have  to  be  set  for  degree  of  closeness  of  matches  and  degree 
of  applicability  to  the  current  position. 
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purely  on  a  direct  match,  but  the  search  time  and  space 
would  obviously  be  reduced,  perhaps  allowing  a  larger 
library. 

We  could  extend  the  above  method  such  that  more 
information  on  game  features  could  be  used.  This  extra 
information  might  include  the  heuristic  scoring  of  the 
position  aspects.  The  features  would  now  be  organized  by 
generality,  the  most  specific  features  being  the  actual 
position  and  the  more  general  features  being  measures  of  the 
more  general  aspects.  Along  with  this  position  feature 
hierarchy,  a  move  feature  hierarchy  is  also  needed  such  that 
the  levels  of  generality  correspond.  For  each  position 
feature,  there  would  correspond  one  or  more  move  features. 
The  move  feature  generality  might  extend  from  an  actual  move 
to  a  general  feature  such  as  a  recommendation  to  play 
defensively.  If  we  don't  get  an  exact  match  from  the 
library  then  at  least  we  end  up  with  a  list  of  move  features 
that  may  be  helpful  in  ordering  the  candidate  moves. 
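
The fallback from an exact match to progressively more general features could be sketched as follows. The library contents, the feature names and the lookup function are invented for illustration; no such hierarchy has been defined in this work.

```python
from typing import Dict, List, Tuple

# Most specific level: an exact position maps to a concrete move.
# The position keys and moves are placeholders, not library data.
EXACT: Dict[str, str] = {"position-A": "move-1"}

# More general levels, ordered from specific to general. Each maps a tuple
# of position features to move features; the feature names are invented.
GENERAL: List[Dict[Tuple[str, ...], List[str]]] = [
    {("king exposed", "material down"): ["play defensively"]},
    {("king exposed",): ["improve king safety"]},
]


def lookup(position: str,
           features: List[Tuple[str, ...]]) -> Tuple[str, List[str]]:
    """Return ("move", [move]) on an exact match, else ("features", advice).

    features[i] is the position's feature tuple at generality level i.
    """
    if position in EXACT:
        return ("move", [EXACT[position]])
    advice: List[str] = []
    for level, key in zip(GENERAL, features):
        advice.extend(level.get(key, []))
    return ("features", advice)
```

When the exact match fails, the returned move features are ordered from specific to general, so a caller could use the earlier entries first when ordering candidate moves.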

The  realization  of  such  a  library  organization  would 
entail  the  development  of  a  cohesive  hierarchy  of  position 
features  and  move  features.  One  suggestion  of  an  approach 
would  be  to  score  the  whole  library  in  terms  of  these 
features,  both  for  positions  and  for  moves.  We  could  then 
statistically  derive  the  relationships  of  the  position 


features  to  the  move  features  to  decide  on  the  hierarchy  of 
the  library. 


The  success  of  such  an  experiment  is  highly  dependent  on 
the  development  of  a  very  comprehensive  set  of  move  and 
position features. Furthermore, we would have to ensure
that  the  scoring  was  done  on  an  "appropriate"  scale.  The 
behaviour  of  the  features  should  be  linear  on  this  scale. 


Before  the  master  game  file  could  be  used  for  a  position 
library,  certain  processing  would  have  to  be  done,  aside 
from  simple  file  formatting.  First,  there  is  the  problem  of 
duplicate  positions  in  the  master  games.  Presumably,  the 
position library should consist of unique position-move
data, hence the redundancy in position data in the master
game library would have to be eliminated. This is easily
done; however, we run into difficulties when we try to
resolve the occurrence of different moves for the same
position or different scores for the same position-move
pairs.


One  possible  strategy  to  be  used  in  deriving  the  library 
is the following: 1) In the case of different moves for the
same  position,  take  the  move  with  the  highest  score.  2)  In 
the  case  of  different  scores  for  a  move,  select  the  worst 
score  and  retain  the  difference  between  the  worst  and  the 
best  scores  as  a  measure  of  move  uncertainty.  This  method 
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would  present  the  best  move  for  a 
position, the most pessimistic score for the move and
a measure of the uncertainty of the position-move.

This  strategy  is  a  rough  first  step  in  establishing  the 
library.  A  further  refinement  might  be  to  give  special 
attention  to  those  position-move  pairs  that  have  a  very  high 
occurrence  rate  in  the  master  games.  (Opening  positions 
should  be  treated  differently,  more  along  the  lines  that  are 
presently  employed  via  an  opening  library.)  Position 
frequency analysis would also be very helpful in determining
any  search  strategies  based  on  likelihood  of  occurrence. 


While  the  above  discussion  has  been  concerned  with  a 
position  library  based  solely  on  well-scored  master  games, 
the  inclusion  of  data  into  the  library  on  a  dynamic  basis  is 
now  examined.  Several  problems  in  this  regard  present 
themselves,  the  most  pressing  being  the  problem  of  selecting 
a  move  to  go  with  a  position.  This  problem  is  trivial  when 
we  are  concerned  with  master  games,  since  our  game  selection 
procedures ensure that we have the "good" move for each
position. However, in the dynamic case, the point at which
we would wish to add a position would be during a game, and
hence  the  reliability  of  the  move  included  with  the  position 
is  subject  to  all  the  problems  of  move  selection  for  the 
program  itself.  If  the  program  is  engaged  in  actual  play, 
we  would  not  want  to  waste  time  adding  to  the  library. 
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after 
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Human 

analysis of the game could then be employed
to correctly score moves.

5.4 The Overlap Experiment

This  experiment  could  be  used  to  measure  the 
effectiveness  of  the  scoring  and  uncertainty  polynomials 
with  regard  to  their  use  in  the  CTW  strategy.  According  to 
the  strategy,  the  moves  selected  for  refinement  are  those 
moves whose H (highest expected value of the move) score
exceeds the L (lowest expected score for the move) score of
the move with the highest H score. The L score of the
highest-rated move is found as

Ptop - K*Dtop

where Ptop = w0h0 + w1h1 + w2h2 + ... + w21h21
and Dtop = sqrt(B0h0 + B1h1 + ... + B21h21)

("top" pertains to the move rated best by the polynomial)

This  is  simply  the  static  score  for  the  move  minus  the 
uncertainty  score  for  the  move.  Similarly,  the  H  score  for 
any  move  n  is  found  as 




Pn + K*Dn

In both cases, K is some positive value.

The  aim  of  this  experiment  would  be  to  determine  the 
smallest value for K such that the set of "overlapping"
moves almost always contains the master, or best, move. We
would want to minimize K such that

Ptop - K*Dtop < Pmast + K*Dmast

("mast"  pertains  to  the  master  move) 

This  minimization  could  be  done  by  tabulating  K's  over  a 
data  base.  This  would  give  a  measure  of  the  likelihood  that 
the  above  relation  holds  for  each  value  of  K. 
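
The tabulation could proceed roughly as follows. Each sample is assumed to hold the four quantities Ptop, Dtop, Pmast and Dmast for one library position; the tuple layout and the function names are hypothetical, and no results from such a run are implied.

```python
import math
from typing import List, Tuple

# One sample per library position: (Ptop, Dtop, Pmast, Dmast).
Sample = Tuple[float, float, float, float]


def k_threshold(p_top: float, d_top: float,
                p_mast: float, d_mast: float) -> float:
    """Infimum of the K values for which Ptop - K*Dtop < Pmast + K*Dmast."""
    gap = p_top - p_mast
    spread = d_top + d_mast
    if spread <= 0:
        # With no uncertainty the relation cannot be rescued by raising K.
        return 0.0 if gap < 0 else math.inf
    return max(gap / spread, 0.0)


def coverage(samples: List[Sample], k: float) -> float:
    """Fraction of samples whose master move is in the overlap set for K = k."""
    hits = sum(1 for pt, dt, pm, dm in samples if pt - k * dt < pm + k * dm)
    return hits / len(samples)


def tabulate(samples: List[Sample], ks: List[float]) -> List[Tuple[float, float]]:
    """Pair each trial value of K with the fraction of positions it covers."""
    return [(k, coverage(samples, k)) for k in ks]
```

Rearranging the relation gives K > (Ptop - Pmast) / (Dtop + Dmast) whenever the denominator is positive, so the per-position thresholds from k_threshold could also be sorted to read off the smallest K that reaches a desired coverage.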

5.5 Conclusions

Our  aim  in  this  research  was  to  develop  and  test 
heuristic  evaluation  measures.  These  measures  proved  to  be 
helpful  in  evaluating  the  effectiveness  of  CHUTE.  We  also 
explored  the  problem  of  implementing  the  CTW  strategy,  but 
without any conclusive results. The viability of the CTW
strategy is still an open question.

We  believe  that  future  chess  program  research  should 
proceed in three distinct yet highly interrelated
directions.
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The  first  of  these  is  the  problem  of  static  evaluation. 
There  is  certainly  a  need  for  a  theory  of  position 
evaluation in chess-specific terms. Very little has been
done in this area to date. We can foresee the same types of
problems emerging as have emerged in other AI research. The
most  significant  of  these  is  the  problem  of  representing 
knowledge  about  the  game. 

The  second  direction  is  in  the  area  of  lookahead 
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Progress in the theory of any
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Appendix  1 

Outline  of  the  Closeness  To  Win  Strategy 
For  a  Chess  Playing  Program 

by  J.J.  Horning  (see  bibliography) 

1. The evaluation measure takes on values in the interval
(-1,1), where 0 indicates an expected draw, 1/N an
expected win in N plies, and -1/N an expected loss in N
plies.

2. For each position we compute "optimistic" and
"pessimistic" values, H and L, denoting the range of its
possibilities; H-L measures the uncertainty of our
evaluation.

3.  A  value  may  be  "backed-up"  one  ply  by  applying  the 
function B(X) = -X/(1+|X|).

4.  Given  the  values  of  a  position's  successors,  its  refined 
values  are  given  by: 

H = B(max(Ls))

L = B(max(Hs))

5.  To  select  a  move,  we  attempt  to  find  one  leading  to  a 
successor  b  such  that  Lb  >  Hs  for  every  other  successor 
s, i.e., one whose pessimistic value exceeds the
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optimistic  values  of  all  its  rivals.  Thus,  the  first 
step  is  to  evaluate  all  successors. 

6. Let Lp be the maximum L for any successor (p is the
"presumptive" move). The moves for which Hc > Lp are the
"candidate" moves, and we wish to reduce this set to a
single  element,  by  applying  a  strategy  of  progressive 
deepening.  If  the  allotted  time  is  spent  without 
eliminating  all  but  one  candidate,  the  choice  among  the 
candidates  is  made  "stylistically". 

7.  At  each  step  of  the  algorithm,  the  candidate  with  the 
largest  H-L  (greatest  uncertainty)  is  selected  for 
refinement  and  the  results  used  to  adjust  the  set  of 
candidates. 


8.  To  refine  a  position: 


(a) if it has not been previously refined, enumerate
its successors and determine which ones have been
previously evaluated (or are in the position
library). Estimate the value of each unevaluated
position by means of the "move plausibility"
function, which yields h >= H. All moves with h > Lp
(the local best pessimistic value) are statically
evaluated. Those with Hc > Lp become the local
candidate set.




(b) if it has been previously refined, the local
"hopeful" move (i.e. the one with maximum Hc) is
refined, and its new value used to adjust the local
candidate set.

In either case, the new Lp and max(Hc) are backed up to
refine its value. (In the exceptional case that
refinement actually increases H or decreases L for the
position, the anomaly is noted for further study, and
the position is immediately re-refined).

9.  All  positions  which  have  been  evaluated  are  saved.  Once 
play  progresses  to  the  point  where  a  position  is  no 
longer  reachable,  it  is  retired  to  a  position  library  on 
secondary  storage. 

10.  Before  finally  selecting  a  move,  all  of  the  selected 
move’s  successors  are  statically  evaluated  (to  avoid  ? 
moves  which  are  immediately  losing,  and  to  screen 
opponent’s  !  moves). 

11.  The  time  while  the  opponent  is  selecting  his  move  is 
used  to  evaluate  more  relevant  positions,  simply  by 
attempting  to  choose  his  move  for  him. 
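
The back-up rule and candidate selection of steps 3 to 7 can be illustrated with a short sketch. It is not Horning's or CHUTE's code: the function names and the (L, H) tuple layout are assumptions. The presumptive move is always kept in the candidate set here, a small liberty taken so that the set is never empty when Hp equals Lp.

```python
from typing import Dict, List, Tuple

Value = Tuple[float, float]   # (L, H): pessimistic and optimistic values


def back_up(x: float) -> float:
    """Step 3: B(X) = -X/(1+|X|), e.g. a win in N plies becomes a loss in N+1."""
    return -x / (1.0 + abs(x))


def refine(successors: List[Value]) -> Value:
    """Step 4: H = B(max(Ls)) and L = B(max(Hs)) over the successors."""
    h = back_up(max(low for low, _ in successors))
    low = back_up(max(high for _, high in successors))
    return (low, h)


def candidates(moves: Dict[str, Value]) -> List[str]:
    """Step 6: moves whose H exceeds Lp, the largest L of any move."""
    presumptive = max(moves, key=lambda m: moves[m][0])
    l_p = moves[presumptive][0]
    return [m for m, (_, high) in moves.items() if high > l_p or m == presumptive]


def next_to_refine(moves: Dict[str, Value]) -> str:
    """Step 7: the candidate with the largest H-L is refined next."""
    return max(candidates(moves), key=lambda m: moves[m][1] - moves[m][0])
```

Because B negates its argument, the largest pessimistic value among the successors yields the position's optimistic value, and vice versa, which is why H and L swap roles in step 4.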

Comments

This strategy attempts to do several things, mostly
aimed at reducing the amount of search needed for good play.
The evaluation measure and back-up rule also enforce a sense
of direction, and permit comparison of values backed up
different numbers of levels.

The  use  of  a  pair  of  values  for  each  position 
generalizes the "live"/"dead" position distinction of
Shannon, on the basis of the observation that in many
situations it is easier to statically detect that there is
much uncertainty than it is to statically calculate the
effect of playing the position out (e.g., checks, forks,
major piece trades).

By  only  refining  positions  whose  values  potentially 
affect  the  choice  of  move,  we  retard  or  avoid  the 
exponential  growth  which  characterizes  depth-first  search 
strategies.  The  selectivity  of  this  strategy  depends 
heavily  on  keeping  most  H-L  values  small;  its  reliability  on 
never having H or L dishonest. Thus, improved static
evaluation  measures  will  pay  off  directly  in  reduced 
searching,  and  we  can  afford  to  invest  more  time  in  careful 
evaluation  of  each  position  encountered. 

The  role  of  the  move  plausibility  function  is  to  screen 
out  the  majority  of  moves  which  have  nothing  to  recommend 
them,  substantially  reducing  the  number  of  careful  static 
evaluations  which  must  be  done.  It  should  never  be  low,  but 
can occasionally be very high without causing any harm (i.e.,
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it need not check for negative side effects of a move, only
for potential benefits).

We  do  not  treat  a  "game  tree"  but  a  directed  graph;  thus 
we  avoid  continually  re-evaluating  the  same  positions.  If 
the  number  of  positions  evaluated  in  choosing  a  move  is  in 
the  range  103-10*,  we  can  think  of  keeping  them  all  in  the 
main  memory.  Furthermore,  "rote  memory"  of  all  the 
positions  evaluated  in  real  play,  and  in  following  book 
games,  can  easily  be  held  in  secondary  storage. 

Since  the  evaluations  for  any  move  are  retained  and  form 
the  basis  of  the  search  for  the  next  move,  we  can  expect 
coherent sequences of play and even combinations to emerge.
This  is  still  short  of  true  planning,  however. 

When  time  constraints  make  it  impractical  to  reduce  the 
candidate  set  to  a  single  element,  there  is  no  clearly  best 
move and the stylistic function is called. It serves several
purposes: (a) to control bold vs. cautious style of play by
means  of  weights  assigned  to  H  and  L  in  the  selection;  (b) 
to  give  a  "personality"  to  the  program  by  emphasising 
certain  types  of  moves;  (c)  to  introduce  some  planning  at 
points  where  relatively  free  choices  can  be  made;  (d)  to 
introduce  variety,  by  means  of  controlled  randomness.  None 
of  these  stylistic  influences  are  appropriate  when  a  single 
move  is  clearly  best. 
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Since  the  search  is  breadth-first  and  all  refinements 
are  immediately  reflected  in  the  values  for  the  candidate 
set,  refinement  can  be  terminated  at  any  time  and  the 
stylistic  function  called.  Thus  the  program  can  "play  with 
an eye on the clock" rather than being committed to a fixed
depth of search.

An interesting aspect of tournament competition would be
the development of heuristics to allocate the available
time. The time allowed for a given move could depend on the
time remaining, opponent's time remaining, the time spent by
the opponent on his previous move, number of candidates,
whether the opponent's move was expected, etc.

By  means  of  its  position  library,  the  program  will  be 
continuously learning, even though its static evaluation
function does not change. Its anomaly detection (step 8)
will  tend  to  point  out  cases  where  the  evaluation  function 
itself  requires  improvement,  but  even  without  human 
intervention,  it  should  not  fall  into  the  same  trap  twice. 
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Appendix  2 

Details  of  Heuristics 

The  following  is  a  description  of  the  heuristics  used  in 
the experiments as they appeared in Valenti's thesis (see
Bibliography). The scoring polynomial parameters are set in
the  program  to  be: 


p0=10   p1=10   p2=15   p3=10   p4=10   p5=10
p6=13   p7=7    p8=7    p9=5    p10=15  p11=15
p12=15  p13=15  p14=15  p15=ir  p16=7   p17=10
p18=7   p19=10  p20=15  p21=10

The  "maximum  parameter  factor"  is  255.  The  development 
value  is  calculated  as  the  sum  of  the  values  of  all  the 
squares that the piece can go to, attack, or defend, plus
the value of any pieces on those squares (friendly or
enemy).

Heuristic  0:  This  heuristic  checks  if  the  move  attacks  a 
passed pawn or the square in front of it. If it attacks
the  passed  pawn  directly,  then  credit  is  given  only  if 
the  attacker  is  on  the  same  column,  or  there  is  a  piece 
directly  in  front  of  the  passed  pawn,  blocking  it.  The 
relative  row  number  of  the  passed  pawn  times  parameter  0 
is  added  to  the  value  of  the  move. 
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Heuristic  1:  This  heuristic  checks  if  the  proposed  move 
attacks  a  piece  that  is  attacking  a  friendly  piece.  The 
value  of  the  attacking  piece  times  parameter  1  is  added 
to  the  value  of  the  move. 

Heuristic 2: This heuristic determines if the proposed move
gets the piece away from an attack. If so, the
developmental value of the attacked piece times
parameter 2 is added to the value of the move. If the
proposed move is from a damaging attack to one that is
not damaging, the above value is added. If the attack
was not damaging, then the value is divided by 8.

Heuristic  3:  This  heuristic  determines  if  the  proposed  move 
exposes  any  piece  to  an  attack  or  defense  by  another 
piece  (friendly  or  enemy  pieces  attacking  or  attacked). 
Different  credit  is  given  depending  on  the  goodness  or 
the  badness  of  the  exposure.  For  instance,  exposed 
checks  rate  highly,  while  exposing  its  own  piece  to  a 
damaging  attack  rates  very  negative. 

Heuristic  4:  This  heuristic  checks  if  the  move  interposes  a 
piece  between  an  attacker  and  an  attacked  piece 
(friendly piece). If so, the developmental value of the
attacked  piece  times  parameter  4  is  added  to  the  value 
of  the  move  (but  only  if  the  interposition  is  safe). 
Half  credit  is  given  if  the  piece  was  not  under  a 
damaging  attack  in  the  first  place.  Pieces  that  were 
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Heuristic  6:  This  heuris 
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the  attacked  pieces  are  multiplied  by  parameter  6  and 
added  to  the  value  of  the  move.  If  it  attacks  an  enemy 
queen,  then  only  half  credit  is  given  (since  it  is 
usually a large number). The developmental values of any
pieces that it was attacking are multiplied and
subtracted. If the attacked pieces are defended only
1/4th credit is given. If more than one piece of greater
value is attacked (forking move), then the difference
between the minimum of the least value attacked and the
least valued attacker on the square moved to, and the
value of the attacking piece, is multiplied by parameter
6  times  the  largest  parameter.  If  it  adds  extra 
attackers  on  a  piece,  then  the  new  total  number  of 
attackers  is  multiplied  by  the  credit  given  to  the  move 
and  added.  If  the  attacked  piece  can  move  to  several 
squares,  then  less  credit  is  given,  in  order  to 
discourage  chasing  pieces.  If  it  attacks  a  piece  that  is 
pinned,  the  maximum  parameter  factor  is  added  to  the 
value  of  the  move. 
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added to pawn moves to the centre 4 squares. It
subtracts  16  times  parameter  7  if  the  move  is  p-q4  or  p- 
q5  with  only  a  queen  defending  (disallows  early 
development  of  the  queen).  In  the  end  game,  a  spatial 
developmental  value  is  calculated  for  the  kings  to 
encourage  attacking  enemy  pieces  with  the  king. 

Heuristic  8:  This  heuristic  determines  if  the  move  allows 
more  freedom  of  movement  for  knights  and  bishops  on  the 
same  side.  The  values  of  the  affected  pieces  are 
multiplied  by  parameter  8  and  added  to  the  value  of  the 
move.  If  a  move  blocks  these  pieces,  then  the  above 
value  is  subtracted  from  the  value  of  the  move. 

Heuristic  9:  This  heuristic  checks  if  the  proposed  move 
attacks  the  piece  last  moved  by  the  opponent.  If  so, 
then  half  the  developmental  value  of  the  piece  times 
parameter  9  is  added  to  the  value  of  the  move. 

Heuristic  10:  This  heuristic  determines  if  the  move  places  a 
rook  on  an  open  file,  if  it  wasn't  already  on  one.  Four 
times  parameter  10  is  added  to  the  value  of  the  move. 

Heuristic  11:  This  heuristic  determines  if  the  move  places  a 
rook  behind  one  of  its  own  passed  pawns,  if  it  wasn't 
already  behind  one.  Four  times  parameter  11  is  added  to 
the value of the move.
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Heuristic  12:  This  heuristic  sees  if  the  move  is  a  capturing 
move.  If  the  capture  is  the  start  of  an  exchange,  then 
parameter  12  times  the  difference  in  the  developmental 
values  of  the  pieces  in  the  exchange,  is  added  to  the 
value  of  the  move.  If  it  is  damaging  (not  counting  the 
piece captured), then the difference in their values
(which  may  be  positive  or  negative)  times  the  largest 
parameter  times  parameter  12  is  added  to  the  value  of 
the  move. 

Heuristic  13:  This  heuristic  checks  if  the  proposed  move  is 
a checking move, castling move, en passant capture or
move  of  a  passed  pawn.  If  so,  and  the  piece  is  not  under 
a  damaging  attack,  then  the  largest  parameter  times 
parameter  13  is  added  to  the  value  of  the  move.  If  it 
is  under  any  attack,  then  only  half  value  is  given.  For 
castling,  the  pawn  structure  is  checked  first  to  see  if 
it  is  suitable. 

Heuristic  14:  This  heuristic  checks  for  pieces  that  may  have 
been  sacrificed  or  saved  by  this  move.  It  checks  for 
redundancy  with  heuristic  #12,  so  parameters  12  and  14 
should  be  the  same.  It  also  checks  for  pieces  under 
damaging  attacks  that  were  not  moved  or  saved.  If  there 
are any, the largest value exposed piece's value times
parameter  14  times  the  largest  parameter  is  subtracted 
from  the  value  of  the  move.  Similarly  all  pieces  saved 
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by  this  move  have  their  values  added  to  the  value  of  the 
move. 

Heuristic  15:  This  heuristic  checks  if  the  move  doubles  up 
pieces that were not already doubled up (queens and
bishops or rooks, bishops and queens, rooks and rooks or
queens). If so, then 8 times parameter 15 is added to
the value of the move. No credit is given at this time
for doubling up rooks and queens on rows, since this was
found to be detrimental in most cases. However, this
is probably more useful in an end game situation and
should be considered in that case. The program has no
specific end game strategies at this time, other than
encouraging the kings to attack more. At this time, the
part of this heuristic that checks for doubling queens
and  bishops  has  been  disabled,  since  it  would  also 
encourage  these  at  times  when  it  was  undesirable.  A  more 
specific  strategy  is  required  to  take  advantage  of  this. 
If  pieces  are  doubled  up  by  moving  an  intervening  piece, 
this  heuristic  gives  no  credit,  but  heuristic  #3,  which 
gives  credit  for  uncovering  defence  of  pieces  will 
compensate  for  this. 

Heuristic 16: This heuristic checks if the move now attacks
the squares around the enemy king (not including it), if
it wasn't already attacking one. If so, then half the
king's value times parameter 16 is added to the value of
the move. Additional credit is given for adding a second
attacker around the enemy king.

Heuristic 17: This heuristic checks if the to-square pins an
enemy piece, or is pinned by an enemy piece. The
intervening piece's colour is ignored. The value of the
pinner or pinee is added or subtracted (respectively),
multiplied by parameter 17, and the result is added to
the value of the move.

Heuristic 18: This heuristic deals with pawn moves. It
first checks if the pawn moved is on one of the squares
in front of the castled king, and discourages it, if so.
If the king hasn't castled and hasn't moved, the pawns
in front of the castling position are discouraged from
moving. If the king has castled, the pawns on the knight
and rook columns, less than row [?], are discouraged from
moving. If the pawn is on the QF 1 or KF 1 position, then
8 * parameter 18 is subtracted, else 16 * parameter 18 is
subtracted. In the end game, these restrictions are
lifted. If the move blocks the progress of an enemy pawn
that has already moved, then 8 * parameter 18 is added to
the value of the move. Credit is also given to discourage
doubling and encourage undoubling of pawns in a capture.
This heuristic also has the added ability to detect
various pawn and pawn-bishop formations. Positive credit
is given for creating these structures, but no negative
credit is given for destroying them.

Heuristic  19:  This  heuristic  checks  if  the  move  blocks  a  pin 
on  a  friendly  piece  or  frees  a  pin  on  an  enemy  piece. 
The  value  of  the  blocked  piece  times  parameter  19  is 
added  to  the  value  of  the  move.  Similarly,  the  value  of 
the  freed  piece  times  parameter  19  is  subtracted  from 
the  value  of  the  move. 
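
As a small worked illustration of heuristic 19, the adjustment can be written as a single expression. The function and argument names are hypothetical, and accepting several blocked or freed pieces at once is our assumption; the description speaks of one piece in each case.

```python
def heuristic_19_adjustment(blocked_pin_values, freed_pin_values, parameter_19):
    """Net credit for blocking pins on friendly pieces and freeing enemy pins.

    blocked_pin_values: values of friendly pieces whose pin is now blocked.
    freed_pin_values:   values of enemy pieces whose pin is now released.
    Both lists would come from board analysis that is not shown here.
    """
    gain = sum(blocked_pin_values) * parameter_19
    loss = sum(freed_pin_values) * parameter_19
    return gain - loss
```

For example, blocking a pin on a piece worth 5 while releasing an enemy piece worth 3, with parameter 19 equal to 2, gives a net adjustment of 4.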

Heuristic 20: This heuristic discourages successive moves
with the same piece. One fourth of the maximum parameter
value is subtracted from the value of the move. This
heuristic is done only in the non-look-ahead positions.

Heuristic 21: This heuristic encourages the king to advance
and  attack  in  the  endgame. 
Computer Output


The  following  is  a  copy  of  the  actual  computer  printouts 
from  the  experiments.  The  first  set  consists  of  the  results 
from  the  ranking  experiments,  while  the  second  set  consists 
of  the  regression  results. 

Ranking  Output 

This experiment measured how often the master move was
selected by CHUTE to be among the top 7 possible moves for a
given position. This measure was run with several
variations.


For each run of the program, we get two sets of results:
SIMPLE PERCENTAGE and CUMULATIVE PERCENTAGE. In both cases,
the first row of numbers, (1 to 7, ABOVE), represents
possible rankings of moves. In the case of simple percent,
TOTALS represents, for each rank, the number of times the
master move was ranked exactly at that rank. The PERCENTS
row represents the percentage of all positions that the
TOTALS value accounts for.

In the case of cumulative percents, the TOTALS for a
particular rank represents the number of cases the master
move was selected at that rank or better (rank of 1 is
better than a rank of 2, etc.). Again, the PERCENTS row
represents the percentage of cases the TOTALS value accounts
for.
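
The two tables can be illustrated with a minimal sketch. It is not the program that produced the printouts: the function name and the input (a list holding the rank CHUTE gave the master move in each position) are assumptions, and we assume the ABOVE column collects every position in which the master move ranked below the top 7.

```python
def ranking_tables(master_ranks, top=7):
    """Build the simple and cumulative TOTALS and PERCENTS rows.

    master_ranks: hypothetical input, the rank (1 = best) assigned to
    the master move in each position. Ranks beyond `top` are counted
    in a final ABOVE column (an assumption about the printout).
    """
    n = len(master_ranks)
    if n == 0:
        raise ValueError("no positions to tabulate")
    labels = [str(k) for k in range(1, top + 1)] + ["ABOVE"]

    # Simple totals: master move ranked exactly at each rank.
    simple = [sum(1 for r in master_ranks if r == k) for k in range(1, top + 1)]
    simple.append(sum(1 for r in master_ranks if r > top))

    # Cumulative totals: master move ranked at this rank or better.
    cumulative, running = [], 0
    for count in simple:
        running += count
        cumulative.append(running)

    def percents(totals):
        return [100.0 * t / n for t in totals]

    return {
        "labels": labels,
        "simple": (simple, percents(simple)),
        "cumulative": (cumulative, percents(cumulative)),
    }
```

With this layout the cumulative PERCENTS entry under rank 1 is the fraction of positions in which the master move was the first choice, and the entry under ABOVE is always 100.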

The  characteristics  of  the  test  runs  were  as  follows: 

FIG  C1.1  This  was  over  the  whole  data  base  with  no  data  or 
heuristic  selection. 

FIG C1.2 This was over the positions in the data base that
were at ply less than 16.

FIG C1.3 This was over the positions that were between (and
including) plies 16 to 60.

FIG C1.4 This was over the positions that were at ply
greater than 60.

The following runs were run on a test file of 3 games.

FIG C2.1 This was run with the full heuristic set and no
weighting changes.

FIG C2.2 This was run with a partial heuristic set but no
weighting changes.

FIG C2.3 This was run with full heuristic set and weightings
derived from the regression.

FIG C2.4 This was run with partial heuristic set and the
derived weightings.
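
The ply ranges used for FIG C1.2 to FIG C1.4 amount to a three-way partition of the data base. A minimal sketch of that selection follows; the function names and the (ply, record) pair layout are hypothetical and do not describe how the data base was actually stored.

```python
from collections import defaultdict


def ply_range(ply):
    """Label the ply range a position falls into."""
    if ply < 16:
        return "ply < 16"
    if ply <= 60:          # plies 16 to 60, both ends included
        return "ply 16-60"
    return "ply > 60"


def partition_by_ply(positions):
    """Group hypothetical (ply, record) pairs by ply range."""
    groups = defaultdict(list)
    for ply, record in positions:
        groups[ply_range(ply)].append(record)
    return dict(groups)
```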
Regression Outputs

The  following  resulted  from  regression  calculations 
using  SPSS  over  the  whole  data  base.  A  full  interpretation 
of  the  format  of  the  output  can  be  had  by  referring  to  the 
SPSS  manual.  Most  of  the  labelings  of  variables  are  quite 
self-explanatory. The variable DSTANCE is the 1/N measure
applied  to  each  position  in  the  file. 

The  interpretation  of  the  output  is  as  follows: 


FIG R1.1 This gives the mean and standard deviation of all
variables over the data base.

FIG R1.2 This is a partially complete correlation matrix of
all variables. The notable data are in the first
column, where we see the correlation of all
variables to DSTANCE.

FIG R1.3 Stepwise regression summary using only the original
heuristic set.

FIG R1.4 As above, with the calculated material measure
MATRI included in the regression.

FIG R1.5 Stepwise regression summary including the
heuristics and calculated material measures ADV02
thru ADV05.

FIG R1.6 Selected heuristics which had a positive
correlation to DSTANCE.

FIG R1.7 Regression using only the calculated material
advantage measures.

FIG R2.1 Regression using the full heuristic set on data
with ply 16 to 60.

FIG R2.1.1 Mean and standard deviation of variables on
data with ply 16 to 60.

FIG R2.2 Regression using the full heuristic set on data
with ply less than 16.

FIG R2.2.1 Mean and standard deviation of variables on
data with ply less than 16.

FIG R2.3 Regression using the full heuristic set on data
with ply greater than 60.

FIG R2.3.1 Mean and standard deviation of variables on
data with ply greater than 60.
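
The first column of the correlation matrix in FIG R1.2 holds the correlation of each variable with DSTANCE. As a reminder of what those entries mean, the sketch below computes a Pearson correlation coefficient in plain Python. It is an illustration only and not the SPSS procedure; the function names and the column layout are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlations_to(target, columns):
    """Correlate every named column with the target column.

    columns: hypothetical dict mapping a variable name (e.g. "h7")
    to its list of values, one per position.
    """
    return {name: pearson(values, target) for name, values in columns.items()}
```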
Uncertainty Output

The  following  output  was  derived  as  the  result  of  the 
uncertainty  experiment.  It  is  a  summary  table  of  regression 
analysis.  The  dependent  variable  was  R1SQ,  calculated  as: 

P1 = 10*h7 + 1945*adv02 + (-10)*h20 + 5*h6 +
     (-16)*h0 + 1151*adv05 + 1047*adv03 + 39*h4 +
     (-9)*h17 + (-4)*h5 + (-5)*h2 + 18*h10 +
     .1*h12 + (-7)*h9 + (-.1)*h14 + (-7)*h8 +
     46*h11 + 2*h3 + 6*h16 + 59*h21 +
     1*h13 + (-8)*h1 + (-2)*h18 + 3*h15 +
     60*adv04 + (-.1)*h19

R1SQ = (100 * (DSTANCE - ((P1 * .00001) + .00230)))**2
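
To make the calculation concrete, the sketch below evaluates P1 and R1SQ for one position. The coefficients are transcribed from the equation above; the function names and the dictionary of variable values are hypothetical, and any variable missing from the dictionary is assumed to be zero.

```python
# Coefficients of P1, transcribed from the equation above.
COEFFICIENTS = {
    "h7": 10, "adv02": 1945, "h20": -10, "h6": 5,
    "h0": -16, "adv05": 1151, "adv03": 1047, "h4": 39,
    "h17": -9, "h5": -4, "h2": -5, "h10": 18,
    "h12": 0.1, "h9": -7, "h14": -0.1, "h8": -7,
    "h11": 46, "h3": 2, "h16": 6, "h21": 59,
    "h13": 1, "h1": -8, "h18": -2, "h15": 3,
    "adv04": 60, "h19": -0.1,
}
SCALE = 0.00001      # multiplier applied to P1
CONSTANT = 0.00230   # additive constant


def p1(values):
    """Weighted sum of the heuristic and material variables."""
    return sum(c * values.get(name, 0.0) for name, c in COEFFICIENTS.items())


def r1sq(dstance, values):
    """Squared, scaled difference between DSTANCE and its predicted value."""
    predicted = p1(values) * SCALE + CONSTANT
    return (100 * (dstance - predicted)) ** 2
```

For instance, a position in which every variable is zero has a predicted value of .00230, so a DSTANCE of .0123 gives an R1SQ of approximately 1.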
Appendix 4
The Game Library

The game library of 98 games is currently stored on the
ATS system at the University of Toronto under the author's
account. There are 10 documents that comprise the file,
named chess.lib01 through chess.lib10. Each of these
documents contains specific 360 JCL as well as the actual
games. Each line of a document could be considered as a
"card image".

The  organization  of  any  one  document  is  as  follows: 

1)  360  JCL  cards  for  the  entire  document 

2)  Up  to  10  groups,  each  consisting  of 

2.1)  an  EXEC  card  (JCL) 

2.2)  a  SYSIN  card  (JCL) 

2.3)  a  complete  game 

Each game consists of

1) an "id" card specifying the game number and the
players.

2) move pairs describing the game

3) an "outcome" card denoting the result of the game,
e.g. BLACKRS means black won by resignation,
WHITECM means white won by checkmate, etc.
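
The card layout of one game can be read with a few lines of code. The sketch below is a hypothetical reader, not part of CHUTE. It assumes a three-digit game number at the start of the "id" card, a comma between the two players, one move pair per card, and a "$" card before the outcome card as on the example page. Only the RS and CM outcome codes named above are decoded; any other code is returned unchanged.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Only these two codes are described in the text.
METHODS = {"RS": "resignation", "CM": "checkmate"}


@dataclass
class Game:
    number: int
    white: str
    black: str
    moves: List[Tuple[str, Optional[str]]]
    winner: Optional[str]
    method: str


def parse_game(cards):
    """Parse the card images of a single game (JCL cards already removed)."""
    # Drop blank cards and the "$" terminator card (assumed from the example).
    cards = [c.strip() for c in cards if c.strip() and c.strip() != "$"]
    id_card, move_cards, outcome = cards[0], cards[1:-1], cards[-1].upper()

    number = int(id_card[:3])  # assumption: three-digit game number
    white, black = [name.strip() for name in id_card[3:].split(",", 1)]

    moves = []
    for card in move_cards:
        parts = card.split()
        # The last card may hold only white's move; annotations are ignored.
        moves.append((parts[0], parts[1] if len(parts) > 1 else None))

    winner = None
    for colour in ("WHITE", "BLACK"):
        if outcome.startswith(colour):
            winner, outcome = colour.lower(), outcome[len(colour):]
            break
    return Game(number, white, black, moves, winner, METHODS.get(outcome, outcome))
```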

The  next  page  contains  an  example  illustrating  the  file 
organization. 

The purpose of this type of organization was to
facilitate the use of games in small manageable chunks. The
JCL serves the purpose of transferring games from the 360 to
the 370 system.

Updating and maintenance of the file can be done using
the normal ATS editing features.

The  example  page  is  followed  by  a  listing  of  the  98  "id" 
cards  of  the  games  on  the  master  game  file. 
//  9999  ,XXXX)  ,  »phys.la
/*route  print  phys
//  exec  carddisk,dsn='
//sysut1  dd  *
132G.Stahlberg, M.Monti
P-Q4          KT-KB3
P-QB4         P-K3
KT-QB3        P-Q4
KT-B3         QKT-Q2
B-KT5         P-B3
P-K3          Q-R4
P*P           KT*P
Q-Q2          B-KT5
B-B1          P-QB4
P-K4          KT(Q4)-B3
B-Q3          P*P
KT*P          0-0
0-0           KT-B4 ?
B*KT          P*B
P-QR3         KT*B
P*P           KT*P(KT5)
N(QB3)-KT5    KT-R3
P-B3          R-Q1
Q-R6          KT-K2
Q*P(KB6)      KT-KT3
KT-B^         P*KT
Q*P
$
whiters
//  exec  carddisk,dsn='
//sysut1  dd  *
133I.Kashdan, A.Kupchik
P-Q4          KT-KB3
P-QB4         P-K3
KT-QB3        P-Q4

Example of Library Format
The Games in the Library

Games  101  to  150  are  from  [Mason60]  and  are  games  prior 
to  1937. 

Games  151  to  180  are  from  [Fine65]  and  are  circa  1945. 

Games  181  to  198  are  from  [Alekhine73]  and  are  from  the 
World  Chess  Championship  of  1937. 


101H.Pilnik, W.W.Adams
102C.Schlechter, J.Mason
103C.H.Alexander, H.V.Mallison
104A.Anderssen, D.Harrwitz
105R.Spielmann, A.Rubinstein
106Dr.A.Alekhine, A.Rubinstein
107R.Reti, R.Spielmann
108Sir G.A.Thomas, A.Rubinstein
109A.Kupchik, S.Factor
110I.A.Horowitz, A.Martin
111E.Rabinovich, S.Flohr
112L.Steiner, V.Petrov
113V.Panov, V.Chekhover
114R.Spielmann, S.Landau
115V.Panov, Marsky
116Korchmar, Bonch-Osmolovsky
117J.Mieses, A.Nimzovich
118E.D.Bogolyubov, G.Maroczy
119A.Lilienthal, I.Bondarevsky
120V.Rauzer, V.Alatortsev
121Dr.E.Lasker, Dr.S.Tartakover
122W.Steinitz, A.Mongredien
123F.Reinfeld, A.E.Santasiere
124S.N.Bernstein, H.Morris
125A.Speyer, A.Rubinstein
126A.Lilienthal, S.Landau
127A.Rubinstein, G.Salwe
128J.R.Capablanca, Allies
129R.Reti, Dr.E.Lasker
130Dr.S.Tartakover, I.Kashdan
131S.Flohr, Sir G.A.Thomas
132G.Stahlberg, M.Monticelli
133I.Kashdan, A.Kupchik
134A.Rubinstein, Dr.K.Treybal
135E.Colle, Sir G.A.Thomas
136A.Rubinstein, A.Nimzovich
137S.Flohr, F.M.Jackson
138R.Fine, R.P.Michell
139A.Becker, F.Glass


140Dr.M.Vidmar, D.Enoch
141M.Monticelli, M.Najdorf
142V.Menchik, Sir G.A.Thomas
143R.P.Michell, S.Flohr
144V.A.Goglidze, S.Flohr
145Sokor, S.Volck
146D.Przepiorka, J.Gottesdiener
147Dr.M.Euwe, J.Mieses
148R.Reti, C.Torre
149M.Botvinnik, G.Levenfish
150H.Kmoch, R.Spielmann
151R.Fine, H.Steiner
152R.Fine, H.Helms
153W.Shipman, R.Fine
154A.S.Denker, A.S.Pinkus
155A.J.Fink, S.Reshevsky
156S.Reshevsky, H.Steiner
157I.A.Horowitz, A.S.Denker
158G.Kramer, G.Drexel
159G.Ravinsky, V.Panov
160V.Smyslov, A.Kotov
161S.Flohr, G.Ravinsky
162A.Tolush, A.Kotov
163A.Tolush, M.Botvinnik
164M.Botvinnik, V.Smyslov
165P.Romanovsky, M.Botvinnik
166V.Ragosin, D.Bronstein
167D.Bronstein, G.Loevenfisch
168A.Alekhine, A.Pomar
169S.Tartakower, M.Christoffel
170M.Christoffel, H.Steiner
171G.Stoltz, M.Botvinnik
172Dr.M.Euwe, H.Steiner
173M.Najdorf, M.Botvinnik
174M.Najdorf, G.Stahlberg
175V.Smyslov, S.Reshevsky
176I.Boleslavsky, R.Fine
177F.Zita, D.Bronstein
178C.H.O'D.Alexander, M.Botvinnik
179V.Smyslov, A.S.Denker
180L.Steiner, C.Purdy
181Dr.M.Euwe, Dr.A.Alekhine
182Dr.A.Alekhine, Dr.M.Euwe
183Dr.M.Euwe, Dr.A.Alekhine
184Dr.A.Alekhine, Dr.M.Euwe
185Dr.A.Alekhine, Dr.M.Euwe
186Dr.M.Euwe, Dr.A.Alekhine
187Dr.A.Alekhine, Dr.M.Euwe
188Dr.A.Alekhine, Dr.M.Euwe
189Dr.M.Euwe, Dr.A.Alekhine
190Dr.A.Alekhine, Dr.M.Euwe
191Dr.M.Euwe, Dr.A.Alekhine


192Dr.M.Euwe, Dr.A.Alekhine
193Dr.A.Alekhine, Dr.M.Euwe
194Dr.A.Alekhine, Dr.M.Euwe
195Dr.M.Euwe, Dr.A.Alekhine
196Dr.A.Alekhine, Dr.M.Euwe
197Dr.M.Euwe, Dr.A.Alekhine
198Dr.A.Alekhine, Dr.M.Euwe


UNIVERSITY  OF  TORONTO 


COMPUTER  SYSTEMS  RESEARCH  GROUP 
BIBLIOGRAPHY OF CSRG TECHNICAL REPORTS*

*  Abbreviations: 

DCS  -  Department  of  Computer  Science,  University  of  Toronto 
EE  -  Department  of  Electrical  Engineering,  University  of 
Toronto 

*  -  Out  of  print 


CSRG-63 AN EXPERIMENTAL EVALUATION OF CHESS PLAYING
HEURISTICS
Lazlo Sugar, December 1975
[M.Sc. Thesis, DCS, 1975]


